Data analysis with Unix
How are Unix commands used for data analysis?
The quiz covers the third lecture and the reading from Biostars chapter 18: Data analysis with Unix (pg 175).
The questions will ask you about the content of the file at
http://data.biostarhandbook.com/data/SGD_features.tab
Download this file onto your computer before venturing forth.
::: tip
Additional information on the SGD_features.tab file can be found in http://data.biostarhandbook.com/data/SGD_features.README
:::
Instructions
- For each question, create a script file, for example `question_1.bash`, `question_2.bash`, …
- Each script should analyze the data and produce output containing the correct information.
- Push your code up to GitHub to receive feedback on your answers!
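A minimal sketch of this workflow, assuming a scratch directory and a tiny made-up table in place of the real download. The script name `question_0.bash`, the file `sample.tab`, and its contents are hypothetical and chosen only for illustration; a real script would read `SGD_features.tab` instead.

```shell
# Use a scratch directory so nothing in your repository is overwritten.
workdir=$(mktemp -d)

# Tiny made-up stand-in for the data file (two tab-separated columns).
printf 'ID1\talpha\nID2\tbeta\n' > "$workdir/sample.tab"

# Write a hypothetical script file. It takes the data path as its first
# argument and prints its answer to standard output.
cat > "$workdir/question_0.bash" <<'EOF'
#!/usr/bin/env bash
# Stop at the first failing command or unset variable.
set -euo pipefail

# Placeholder analysis: list the values in the first column.
cut -f 1 "$1"
EOF

# Run the script and capture whatever it prints.
output=$(bash "$workdir/question_0.bash" "$workdir/sample.tab")
echo "$output"
```

Keeping one small script per question means each answer can be re-run and reviewed on its own once the code is pushed.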
Questions
- How many lines does this file contain?
- How many lines match the pattern `gene`?
- How many lines match the pattern `ORF`?
- How many lines match the pattern `ORF` in the second column?
- Which word of the second column appears `50` times?
- The word `Z3_region` appears how many times in the second column?
- How many features are located on the forward strand?
- How many features have no strand information listed?
- The standard gene name column lists each gene name only once: (True or False)
- More rows have feature types than feature names: (True or False)
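
Questions of this kind tend to rely on a handful of standard Unix idioms: counting lines, matching patterns, and tabulating a column. The sketch below shows them on a small made-up table, not on the SGD data. The variable names and sample rows are assumptions for illustration, and the column positions in the real file should be taken from the README rather than from this example.

```shell
# Tiny made-up table (ID, feature type, strand), tab-separated.
# This is NOT the SGD data; the last row has an empty third column.
sample=$(mktemp)
printf 'S1\tORF\tW\nS2\tORF\tC\nORF3\ttRNA\tW\nS4\tCDS\t\n' > "$sample"

# Count all lines in the file.
total=$(wc -l < "$sample")

# Count lines that match a pattern anywhere on the line.
anywhere=$(grep -c 'ORF' "$sample")

# Count lines whose second column is exactly the pattern.
in_col2=$(awk -F'\t' '$2 == "ORF" { n++ } END { print n + 0 }' "$sample")

# Tabulate how often each value occurs in the second column.
counts=$(cut -f 2 "$sample" | sort | uniq -c | sort -rn)

# Count rows where the third column is empty.
empty_col3=$(awk -F'\t' '$3 == "" { n++ } END { print n + 0 }' "$sample")

echo "lines: $total"
echo "match anywhere: $anywhere"
echo "match in column 2: $in_col2"
echo "empty column 3: $empty_col3"
echo "$counts"
```

On this sample the pattern matches three lines anywhere but only two in the second column, because `ORF3` in the first column also matches. That difference is why a plain `grep` and a column-specific test can give different counts.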