## 1/5/2018

1. ssh to login.cs.unc.edu, open two sessions.
2. Compare logs in
   * /afs/unc/proj/cismm/web_logs and
   * /net/www/var/log/httpd
3. Copy new logs from /net/www/var/log/httpd to /afs/unc/proj/cismm/web_logs.
   cp /net/www/var/log/httpd/cismm-data_access_log-2017* /afs/unc/proj/cismm/web_logs
4. From /afs/unc/proj/cismm, download the latest software_download_diagram_20xx.xls.
5. Edit /afs/unc/proj/cismm/count_downloads_scripts/by_year. Add a new year to each line where there is a list of years.
6. cd /afs/unc/proj/cismm/count_downloads_scripts. Run the corresponding script for each software tool listed in the Excel sheet (or make sure all scripts are in run_all.sh and run that script instead).
7. After each run, run "./by_year ~/tmp/cismm_xxx_total_list" and look at the downloads for the desired year in the first section.

Log files
=========

The log files are located in two places:

1. www.cs.unc.edu/~nanowork
2. cismm.org

The scripts used to count downloads from each location are different. list_xxx_download counts downloads from www.cs.unc.edu/~nanowork. cismm_list_xxx_download counts downloads from cismm. The main server was switched from the former to the latter in 2010, so the numbers from www.cs.unc.edu/~nanowork will not change.

Note that you SHOULD NOT count downloads from www.cs.unc.edu/~nanowork anymore: the server only keeps 4~5 years of logs, and older logs get deleted, which makes the log collection incomplete. So just use last year's number for this part.

For cismm.org logs, we need to get the up-to-date logs:

1. ssh cismm.org
2. sudo su
3. Logs in /var/log/httpd/ are text files and up to date. Compare the logs that you already have in /afs/unc/proj/stm/archive/www_hits/cismm.org_web_logs/ to the logs in /var/log/httpd/, and copy those access_log-xxx that you don't have.
4. cd /afs/unc/proj/stm/archive/www_hits/cismm.org_web_logs/
5. "gzip access_log*" to compress the logs just copied over.
6. "cp /afs/unc/proj/stm/archive/www_hits/cismm.org_web_logs/access_log-2015*". I also keep a local copy of all the logs. "/afs/cs.unc.edu/home/phsiao/web_logs" is the location where the scripts check for logs.

To get a new download number for a software tool:

1. Make sure the new logs are copied over (see the steps above).
2. vi ./by_year. Add a new year to each line where you see a list of years.
3. Open a terminal and run "./cismm_list_xxx_download", then "./by_year ~/tmp/cismm_xxx_total_list", to count downloads from cismm.org.
4. Use the number from the "Total" section and copy it to the Excel sheet.

On-campus connections are filtered out when the cismm_xxx_download scripts collect the logs.

Excel sheet
===========

Each cell in the Excel sheet is the sum of two numbers, one from each set of logs. The first number is from the old server and should not be changed.

Logs from www.cs.unc.edu/~nanowork only track downloads back to 5 years ago, so if a number is smaller, check /net/www/var/log/httpd to see if the logs are missing.

Sometimes the files on the cismm website have different names; be sure to add a new 'grep' command to include them.

==========================================
Things to do if you need to update the graph:
==========================================

ImageSurfer downloads are different. Before May 11, 2014, ImageSurfer 1 was downloaded from imagesurfer.org, but logs from imagesurfer.org are only kept for a short time (a month), and we didn't save the old logs, so there is no way to trace back downloads before 2014. After 2014, you only need to count the downloads from cismm.org, because the download link on imagesurfer.org has been redirected to cismm, so we won't lose them.