Calculating route distance (driving as mode of transit) instead of Haversine distance #67

Open
wants to merge 1 commit into
base: main

@ArunShresthaa ArunShresthaa commented Apr 24, 2024

This PR uses route distance (with driving as the mode of transit) instead of Haversine distance.
The route distance is calculated with the distancematrix.ai API (similar to the Google Maps API for finding the best route), which allows 1000 free requests.

Added calculate_route_distance.py, which calculates the route distance from every school to every center and saves it to a distance.tsv file in the results folder. The result is saved because the calculation is slow, owing to the large number of API requests.
Use the accurate API for better accuracy (I used the fast API for speed).
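The precompute-and-save step described above could look roughly like the sketch below. This is not the code in calculate_route_distance.py: the column names (`school_id`, `center_id`, `distance_km`) and the `fetch_distance_km` callable are assumptions made for illustration, and the actual API request is deliberately left to the caller. The sketch skips pairs that are already in the file, so an interrupted run does not spend requests twice.

```python
import csv
from pathlib import Path
from typing import Callable, Dict, Iterable, Tuple

Pair = Tuple[str, str]  # (school_id, center_id); the id format is an assumption


def load_cached(path: Path) -> Dict[Pair, float]:
    """Read previously saved (school, center) -> distance rows, if the file exists."""
    if not path.exists():
        return {}
    with path.open(newline="") as f:
        reader = csv.DictReader(f, delimiter="\t")
        return {
            (row["school_id"], row["center_id"]): float(row["distance_km"])
            for row in reader
        }


def build_distance_table(
    school_ids: Iterable[str],
    center_ids: Iterable[str],
    fetch_distance_km: Callable[[str, str], float],  # hypothetical API wrapper
    out_path: Path,
) -> Dict[Pair, float]:
    """Fetch only the missing pairs, then rewrite the TSV with every known pair."""
    table = load_cached(out_path)
    centers = list(center_ids)
    for school in school_ids:
        for center in centers:
            if (school, center) not in table:
                table[(school, center)] = fetch_distance_km(school, center)

    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["school_id", "center_id", "distance_km"])
        for (school, center), km in sorted(table.items()):
            writer.writerow([school, center, km])
    return table
```

Calling `build_distance_table(schools, centers, my_fetch, Path("results/distance.tsv"))` a second time with the same inputs makes no further requests, which matters when only 1000 free requests are available.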

Added school_center_using_route_distance.py (so the original file is not affected), which uses the previously calculated route distance as the distance between schools and centers.
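As an illustration of how a saved route distance could replace the Haversine distance at lookup time, here is a minimal sketch. The dictionary keys (`id`, `lat`, `long`) and the fallback to Haversine for pairs missing from the table are assumptions, not necessarily what school_center_using_route_distance.py does.

```python
from math import asin, cos, radians, sin, sqrt
from typing import Dict, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def distance_km(
    school: dict,
    center: dict,
    route_table: Dict[Tuple[str, str], float],
) -> float:
    """Prefer the precomputed route distance; fall back to Haversine if absent."""
    key = (school["id"], center["id"])  # field names are hypothetical
    if key in route_table:
        return route_table[key]
    return haversine_km(school["lat"], school["long"], center["lat"], center["long"])
```

The fallback keeps the allocation usable when the table is incomplete, for example after the free request quota runs out partway through.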

Result: in testing on a small set of schools, the allocated students were less fragmented than with Haversine distance (screenshots for reference).

Using Route Distance:
image

Using Haversine Distance
image

Contributor

@LuluW8071 LuluW8071 left a comment


  • It is better to load the API key from the environment, as shown here, instead of passing a demo API key in the code. Otherwise, someone may later push the code with a real API key still in it.

    pip install python-dotenv

    Place the API key in a .env file:

    from dotenv import load_dotenv
    import os
    
    load_dotenv()
    
    api_key = os.getenv("API_KEY")
  • Also update README.md with a reference on where to obtain the API key.

Contributor

@LuluW8071 LuluW8071 left a comment


  • Revert to the original file name, school_center.py.

@ArunShresthaa
Author

> Revert back to original name of school_center.py.

Thanks for the guidance, but this method is very slow because of the API calls involved. The given sample data needs a total of 57,024 requests, which would probably take a whole day, so I decided to keep a separate file until this PR is merged.

Distancematrix.ai gives only 1000 free requests. The Google Maps API would be easier to use, but it is paid.

@sapradhan
Collaborator

Students will need to travel from where they live. Distance from school was probably used because it is the only feasible approximation. So I wonder whether a method this precise is warranted here, especially given the cost of implementation. Also, this uses driving distance; a more appropriate measure would account for the route and the availability of public transport.
Just offering a different perspective. This is an interesting solution nonetheless, and thank you for contributing.

@ArunShresthaa
Author

ArunShresthaa commented Apr 25, 2024

> Students will need to travel from where they live. Distance from school was probably used because it is the only feasible approximation. So I wonder whether a method this precise is warranted here, especially given the cost of implementation. Also, this uses driving distance; a more appropriate measure would account for the route and the availability of public transport. Just offering a different perspective. This is an interesting solution nonetheless, and thank you for contributing.

The threshold distance is set to 2 km, so the unavailability of public transport is not much of an issue.

The calculation can be sped up by using the Google Maps API, which is much faster (paid), and by distributing the task and running it in parallel. With 10 processes running simultaneously, it is just 5,702.4 calls per process, which would complete in approximately 1 hour. The calculation needs to be done only once, and the result can be reused.
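A rough sketch of the parallel idea, under stated assumptions: it uses a thread pool rather than separate processes (the work is waiting on network responses, so threads are usually sufficient), and `fetch_distance_km` is a hypothetical wrapper around whichever distance API is used. Any rate limit imposed by the provider is not covered here and would cap the real throughput.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]  # (school_id, center_id)


def fetch_all(
    pairs: List[Pair],
    fetch_distance_km: Callable[[str, str], float],  # hypothetical API wrapper
    workers: int = 10,
) -> Dict[Pair, float]:
    """Request distances for all pairs using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input pairs.
        distances = pool.map(lambda pair: fetch_distance_km(*pair), pairs)
        return dict(zip(pairs, distances))


# Average load per worker for the sample data: 57,024 requests over 10 workers.
calls_per_worker = 57_024 / 10  # 5702.4
```

The returned dictionary can be written to distance.tsv once and reused, as noted above.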

@LuluW8071
Contributor

The problem is that this is just sample data. The actual data will need more API calls, and there are around 35,000 schools in Nepal (rough estimate). I don't think this approach is economically feasible, and transport limitations are a factor in some rural areas.
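To put the scaling concern in numbers, the request count for an all-pairs calculation is simply schools times centers. The sketch below is back-of-the-envelope arithmetic only: the 1,000-request allowance comes from this thread, while the number of centers for the full data set is not stated here, so the value used in the last line is purely hypothetical.

```python
from math import ceil


def pair_requests(n_schools: int, n_centers: int) -> int:
    """One request per (school, center) pair."""
    return n_schools * n_centers


def free_allowances_needed(requests: int, allowance: int = 1000) -> int:
    """How many free allowances of `allowance` requests the job would consume."""
    return ceil(requests / allowance)


# Sample data from this PR: 57,024 requests in total.
sample_allowances = free_allowances_needed(57_024)

# Full data: about 35,000 schools (rough estimate from this thread).
# The center count is NOT given here; 1,000 is a placeholder for illustration.
hypothetical_full_requests = pair_requests(35_000, 1_000)
```

With these inputs, the sample data alone already exceeds a single free allowance many times over, and the full data set grows linearly with the number of centers.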


3 participants