
linkedin_jobs_crawler

linkedin_jobs_crawler is a Python web crawler script written to investigate crawling techniques on LinkedIn. It searches for job postings (entries) that list a job poster and extracts the company name, job position, and link to the job page.

The crawler can be modified to run in a headless browser; by default it does not, so that the user can enter their own login information.

Installation:

Required Packages/Software:

The following are required to use this script:

- Python 3
- Git
- Google Chrome and Chromedriver 2.41 (installation covered below)

Installing The Script:

  1. Clone the repository to your machine using git:

git clone https://github.com/will-huynh/linkedin_jobs_crawler.git

  2. Go to the cloned directory on your local machine and pull the latest version using git:

cd linkedin_jobs_crawler

git checkout master

git pull

  3. Download Chromedriver 2.41 and place the chromedriver executable file in the linkedin_jobs_crawler folder (the same directory as the script).

Using the crawler:

The crawler is run from the command line. It takes a query (job position), a search location, and an output file name with the .csv extension, and writes the scraped results to /<script_dir>/output/<csv_file>.

First, navigate to the script directory. Then run the crawler with the following command and its three required arguments:

python3 linkedin_jobs_crawler.py

-k or --keyword "<job position>"

-l or --location "<search location>"

-o or --output "<csv_filename>"
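The argument handling described above can be sketched with Python's standard argparse module. The flag names match those listed; the actual internals of linkedin_jobs_crawler.py may differ, so treat this as an illustration rather than the script's real code:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Mirrors the three required flags described above. This is an
    # illustrative sketch, not the script's actual parser.
    parser = argparse.ArgumentParser(description="LinkedIn jobs crawler")
    parser.add_argument("-k", "--keyword", required=True,
                        help="job position to search for")
    parser.add_argument("-l", "--location", required=True,
                        help="search location (city name or postal code)")
    parser.add_argument("-o", "--output", required=True,
                        help="output CSV file name, e.g. results.csv")
    return parser

# Parse a sample command line like the examples below.
args = build_parser().parse_args(
    ["-k", "engineer", "-l", "Vancouver, Canada", "-o", "output.csv"]
)
print(args.keyword, args.location, args.output, sep=" | ")
```

Because all three arguments are marked required, argparse exits with a usage message if any of them is missing.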

Some example commands would be:

python3 linkedin_jobs_crawler.py -k "engineer" -l "Vancouver, Canada" -o "output.csv"

python3 linkedin_jobs_crawler.py --keyword "developer" --location "89143" --output "results.csv"
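The output convention described above (results written to /<script_dir>/output/<csv_file> with company name, job position, and job page link) can be sketched as follows. write_results is a hypothetical helper and the column names are assumptions; the real script may name its columns differently:

```python
import csv
import os

def write_results(rows, csv_filename, script_dir="."):
    # Writes scraped rows to <script_dir>/output/<csv_filename>, creating
    # the output directory if needed. Columns reflect the three fields the
    # crawler collects; actual header names in the script may differ.
    out_dir = os.path.join(script_dir, "output")
    os.makedirs(out_dir, exist_ok=True)
    out_path = os.path.join(out_dir, csv_filename)
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["company", "position", "job_link"])
        writer.writerows(rows)
    return out_path

# Illustrative row with made-up data only.
path = write_results(
    [("Acme Corp", "Software Engineer",
      "https://www.linkedin.com/jobs/view/123")],
    "output.csv",
)
print(path)
```

Each scraped entry becomes one CSV row, so the file opens directly in spreadsheet tools for filtering and sorting.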
