2023 Class
On the UNIX command line, go into your bigdata folder. If you have not used the cluster before, you will be in the gen220
project; otherwise you may have your own lab bigdata folder.
cd ~/bigdata # this should work but if it doesn't
cd /bigdata/gen220/$USER # will go into your bigdata folder for the class
# if the above doesn't work you are likely already in a lab group on HPCC
cd /bigdata/$GROUP/$USER # should work since $USER is your login and $GROUP is your primary labgroup
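If you are not sure which of the three locations applies to you, the following is a minimal sketch that tries them in the same order and goes into the first one that exists. The helper name first_existing_dir is made up for this example, and the fallback of gen220 when $GROUP is unset is an assumption, not something the cluster guarantees.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first argument that is an existing directory.
first_existing_dir() {
    local dir
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# $USER is normally set at login; fall back to id -un just in case.
user=${USER:-$(id -un)}

# Same three candidates as above, in the same order.
# Assumption: if $GROUP is not set, use gen220 as the group name.
if target=$(first_existing_dir "$HOME/bigdata" "/bigdata/gen220/$user" "/bigdata/${GROUP:-gen220}/$user"); then
    cd "$target" || echo "could not enter $target" >&2
else
    echo "no bigdata folder found in the usual places" >&2
fi
pwd   # show where you ended up
```

Running pwd at the end is just a quick check that you landed where you expected.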
For your homework:
mkdir -p ~/bigdata/gen220/homework
and then cd ~/bigdata/gen220/homework
git clone https://github.com/biodataprog/GEN220_data.git
Now you want to make a folder for your work for this class
# you can make a folder for GEN220
mkdir gen220
# go into that folder
cd gen220
# now use git to checkout the class data folder
git clone https://github.com/biodataprog/GEN220_data.git
# now go into this folder
cd GEN220_data
Look around in the folder. Go into the tabular folder, where I’ve stored some tab- or comma-delimited data.
You will later need to copy a file from this folder into your homework folder.
git clone git@github.com:biodataprog/2023-hw1-YOURGITHUBID
OR (for https you will need to create a token to use as your password)
git clone https://github.com/biodataprog/2023-hw1-YOURGITHUBID.git
cd 2023-hw1-YOURGITHUBID
Write your script filesize.sh; you can do this in Jupyter on the web, you can edit on the command line with nano, vi, or emacs, or you can use ssh with a Visual Studio tunnel.
Run the script with ./filesize.sh
Save your work with git commit and then git push
# this step saves a version of the code
git commit -m "This is a homework 1 solution" filesize.sh
# this step will push the data from HPCC or your computer UP to the github site
# this step will request your username (YOURGITHUBID) and your password (that TOKEN I mentioned before).
# if you have set up your github account with SSH keys then it will ask you for your SSH key password instead
git push
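One thing that can trip you up: git commit with a file name only works for files git already tracks, so a brand-new file needs git add once first. Below is a small sketch in a throwaway repository (made with mktemp, so your real homework repository is not touched). The name and email passed with -c are placeholders for this demo only, and there is no git push because the throwaway repository has no remote.

```shell
#!/usr/bin/env bash
# Throwaway repository for practice; nothing here touches your homework repo.
repo=$(mktemp -d)
git init -q "$repo"

# A placeholder script standing in for your real filesize.sh.
printf '#!/usr/bin/env bash\necho "placeholder"\n' > "$repo/filesize.sh"

# New files are untracked, so stage the file once with git add.
git -C "$repo" add filesize.sh

# Same commit command as above; the -c settings are demo-only placeholders
# so the commit works even if you have not configured git yet.
git -C "$repo" -c user.name="Demo User" -c user.email="demo@example.com" \
    commit -q -m "This is a homework 1 solution" filesize.sh

# Confirm the commit was recorded.
git -C "$repo" log --oneline
commit_count=$(git -C "$repo" rev-list --count HEAD)
```

In your real homework folder you would follow the commit with git push as shown above.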
Get a copy of the threatened-species.csv.gz file (see the info in HW1), or you can just run the included ./setup.sh script to download it. I also encourage you to practice with the cp command.
Write a script called filesize.sh; that script should do the following:
Report the size of the compressed file with du or ls -l
Uncompress the file with gunzip while leaving the original alone by adding the -k option (gunzip -k threatened-species.csv.gz); add a -f option so it will overwrite the new file as well in case you run this more than one time.
Report the size of the uncompressed file with du or ls -l
Compute the ratio of the two sizes, for example with python -c "print($COMPRESSED / $UNCOMPRESSED)"
Use git commit -m 'a message' and git push to save the changes to github.
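To see the pieces working together before you write your own script, here is a rough sketch on a made-up file (demo.csv, created in a temporary folder), not on the homework data. It reads the byte count from the fifth column of ls -l; note that du by default reports disk usage in blocks rather than exact bytes, so the two tools can give different numbers. The ratio is computed with awk here only as a stand-in for the python one-liner, which works the same way once the two variables are set.

```shell
#!/usr/bin/env bash
# Work in a temporary folder so nothing in your homework folder changes.
workdir=$(mktemp -d)

# Made-up demo data: 1000 identical lines, which compress very well.
for i in $(seq 1 1000); do
    echo "example,row,of,text"
done > "$workdir/demo.csv"
gzip "$workdir/demo.csv"        # this leaves only demo.csv.gz behind

# Size of the compressed file in bytes (column 5 of ls -l).
COMPRESSED=$(ls -l "$workdir/demo.csv.gz" | awk '{print $5}')

# -k keeps the .gz file, -f overwrites demo.csv if it is already there.
gunzip -k -f "$workdir/demo.csv.gz"

# Size of the uncompressed file in bytes.
UNCOMPRESSED=$(ls -l "$workdir/demo.csv" | awk '{print $5}')

echo "compressed:   $COMPRESSED bytes"
echo "uncompressed: $UNCOMPRESSED bytes"

# Ratio of compressed to uncompressed size (awk stand-in for python -c).
awk -v c="$COMPRESSED" -v u="$UNCOMPRESSED" 'BEGIN { printf "%.4f\n", c / u }'
```

Because of -k and -f you can run this part more than once and both demo.csv.gz and demo.csv will still be there afterwards.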