In Lab #4 you will use the scraping techniques outlined in class to identify and pull data from a site that has content of interest to you.
You need to use Node on repl.it to grab the data and generate a CSV. That can be done by:
- Extracting tabular data from a single or multiple pages
- Extracting non-tabular data from a single or multiple pages
Each of these can be done by emulating one of the following repl.it examples:
- Single page table scraper
- Multi-page table scraper (note that this outputs two files to copy and paste)
- Single page non-table scraper
- Paged results non-table scraper
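For orientation, here is a minimal sketch of what a table scraper boils down to: find each row, pull the text out of each cell, and join the cells into CSV lines. It is not the code from the repl.it examples. The sample HTML, the function names, and the regex-based parsing are all assumptions made for illustration, and the regexes only cope with simple, well-formed tables (no nested tables, no HTML entity decoding).

```javascript
// Hypothetical sample markup standing in for the HTML of a fetched page.
const sampleHtml = `
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Widget, large</td><td>4.50</td></tr>
  <tr><td><b>Gadget</b></td><td>12.00</td></tr>
</table>`;

// Strip any tags inside a cell and collapse runs of whitespace.
function cellText(cellHtml) {
  return cellHtml.replace(/<[^>]*>/g, "").replace(/\s+/g, " ").trim();
}

// Return the table as an array of rows, each row an array of cell strings.
function extractTableRows(html) {
  const rows = [];
  const rowPattern = /<tr\b[^>]*>([\s\S]*?)<\/tr>/gi;
  const cellPattern = /<t[hd]\b[^>]*>([\s\S]*?)<\/t[hd]>/gi;
  for (const rowMatch of html.matchAll(rowPattern)) {
    const cells = [];
    for (const cellMatch of rowMatch[1].matchAll(cellPattern)) {
      cells.push(cellText(cellMatch[1]));
    }
    // Skip rows that contain no header or data cells.
    if (cells.length > 0) rows.push(cells);
  }
  return rows;
}

// Quote a field only when it contains a comma, quote, or line break.
function toCsvField(value) {
  const text = String(value);
  return /[",\n\r]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Join rows into CSV text, one line per row.
function toCsv(rows) {
  return rows.map((row) => row.map(toCsvField).join(",")).join("\n");
}

const tableRows = extractTableRows(sampleHtml);
const csvText = toCsv(tableRows);
console.log(csvText);
```

For a multi-page table, the same row extraction would run once per page, keeping the header row from the first page only.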
- Create a new repl.it Node.js project using this link
- Go to the page you’re using and “view source” in your browser.
- Choose the appropriate type of script and customize it to fit your situation
- Once it’s working, export to GitHub as a gist using the “share” button
- Submit the gist url to ICON
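As a rough sketch of the "view source, then customize" steps for non-tabular data: once you spot the repeated element that wraps each item in the page source, the script pulls the same fields out of every item and writes one CSV line per item. The markup, the class names (`story`, `byline`), and the function names below are hypothetical, and the regex matching is a simplification that assumes items are not nested. Your script should follow whichever repl.it example you chose.

```javascript
// Hypothetical markup, standing in for what "view source" might show.
const pageHtml = `
<ul>
  <li class="story"><a href="/news/1">Council approves budget</a> <span class="byline">A. Reporter</span></li>
  <li class="story"><a href="/news/2">Library hours "expanded"</a> <span class="byline">B. Writer</span></li>
</ul>`;

// Return the first capture group of a pattern, or "" if it is absent.
function firstMatch(fragment, pattern) {
  const match = fragment.match(pattern);
  return match ? match[1].trim() : "";
}

// Build one object per repeated item found in the page.
function extractStories(html) {
  const stories = [];
  const itemPattern = /<li class="story">([\s\S]*?)<\/li>/g;
  for (const itemMatch of html.matchAll(itemPattern)) {
    const itemHtml = itemMatch[1];
    stories.push({
      title: firstMatch(itemHtml, /<a\b[^>]*>([\s\S]*?)<\/a>/),
      link: firstMatch(itemHtml, /<a\b[^>]*href="([^"]*)"/),
      author: firstMatch(itemHtml, /<span class="byline">([\s\S]*?)<\/span>/),
    });
  }
  return stories;
}

// Quote every field so commas and quotes in scraped text cannot break columns.
function toCsvLine(fields) {
  return fields
    .map((field) => `"${String(field).replace(/"/g, '""')}"`)
    .join(",");
}

const stories = extractStories(pageHtml);
const csvLines = [toCsvLine(["title", "link", "author"])];
for (const story of stories) {
  csvLines.push(toCsvLine([story.title, story.link, story.author]));
}
const csvOutput = csvLines.join("\n");
console.log(csvOutput);
```

This sketch prints the CSV to the console. To save it as a file instead, Node's built-in `fs.writeFileSync` can write the same string to disk.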
- Runs and generates CSV output: 3 points
- Code is correct w/ good variable names: 5 points
- At least 10 meaningful comments: 2 points