Socioeconomic dataset for analysing demand prediction of weekend markets in the city of Hamburg, Germany
In this DDLitLab-funded data literacy student project, our goal was to predict demand for weekend markets in the city of Hamburg using open-source data and OpenStreetMap in conjunction with machine learning algorithms. A brief article about the initial grant and our approach is available here: https://www.cliccs.uni-hamburg.de/about-cliccs/news/2023-news/2023-08-24-ddlitlab-event.html
GitLab repository: https://gitlab.rrz.uni-hamburg.de/exploring-avenues-for-the-deployment-of-machine-learning-algorithms-for-sustainable-small-agricultural-business-information-using-openstreetmap/main-project-v-3
This repository is intended to make our code and visualisations openly available to University of Hamburg students for further research. It must not be used without citation under any circumstances, and the University/authors reserve the right to withdraw consent at any time.
Please do not forget to cite our work in the event of fair use.
Organisation of our GitLab repository
Codes: contains the code for the methods used for data preparation, variable selection, visualisations showing the spatial characteristics of our variables, calculation of indices such as correlation coefficients, and machine learning methods in increasing order of complexity. The city district (Stadtteil) is the unit of analysis.
Data (uploaded datasets): The open-source data for the project were obtained from OpenStreetMap (https://wiki.openstreetmap.org/wiki/Use_OpenStreetMap) and Statistik Nord (https://www.statistik-nord.de/). Each variable contains values for all Stadtteile (city districts) of Hamburg. The filenames are self-explanatory.
The Hamburg shapefile was obtained from Geofabrik (https://www.geofabrik.de/de/data/shapefiles.html). In addition to the original data uploaded in this section, we have also provided the final data used with the algorithms in final_data.csv.
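Because the city district is the unit of analysis, a quick sanity check after loading a district-level table is that each district appears exactly once. The following is a minimal sketch of such a check, not the project's own loading code. The column name `stadtteil` is a hypothetical placeholder, since the actual column names in final_data.csv are not documented here.

```python
import csv
from collections import Counter


def load_district_rows(lines, district_column="stadtteil"):
    """Read a district-level CSV table and return its rows as dicts.

    `lines` is any iterable of text lines, for example an open file.
    `district_column` is a HYPOTHETICAL column name; replace it with the
    district identifier actually used in the file.
    """
    rows = list(csv.DictReader(lines))
    if not rows:
        raise ValueError("the table contains no data rows")
    if district_column not in rows[0]:
        raise KeyError(f"column {district_column!r} not found")

    # With the district as the unit of analysis, every district
    # should occur in exactly one row.
    counts = Counter(row[district_column] for row in rows)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"duplicated districts: {duplicates}")
    return rows


# Possible usage, assuming final_data.csv is in the working directory:
#
#     with open("final_data.csv", newline="", encoding="utf-8") as handle:
#         rows = load_district_rows(handle)
```

The standard-library `csv` module is used here only to keep the sketch free of dependencies; a pandas-based version would work equally well.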
Our repository contains the following additional sections:
Results: This section contains results produced by the code in the first section. It includes the final 10 variables selected for the study, the results of the VIF (variance inflation factor) analysis, the correlation matrix, and some model output statistics.
Visualisations: This section is dedicated to visualisations of the variables used for the study and of the results obtained with the various methods.
In case of any questions, please do not hesitate to contact us at our official student email addresses: first.lastname@studium.uni-hamburg.de. We are also available on LinkedIn for professional networking in case of other queries.
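For readers unfamiliar with the VIF analysis and correlation matrix listed under Results, the sketch below shows one common way to compute both with NumPy. It is an illustration under stated assumptions, not necessarily the implementation used in the repository: the function name `correlation_and_vif` is hypothetical and the example data are synthetic. It relies on the identity that the VIF of variable j, defined as 1 / (1 - R_j^2) with R_j^2 taken from regressing variable j on the other predictors (with an intercept), equals the j-th diagonal element of the inverse correlation matrix.

```python
import numpy as np


def correlation_and_vif(X):
    """Return the Pearson correlation matrix and the VIF of each column.

    X is an (n_districts, n_variables) array of predictor values.
    """
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)  # columns are variables
    # VIF_j = 1 / (1 - R_j^2) equals the j-th diagonal element of the
    # inverse of the predictors' correlation matrix.
    vif = np.diag(np.linalg.inv(corr))
    return corr, vif


# Synthetic example (NOT project data): the third column is built to be
# strongly collinear with the first, so both receive a high VIF.
rng = np.random.default_rng(0)
a = rng.normal(size=100)
b = rng.normal(size=100)
c = a + 0.1 * rng.normal(size=100)
corr, vif = correlation_and_vif(np.column_stack([a, b, c]))
print(np.round(corr, 2))
print(np.round(vif, 1))
```

A common rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic multicollinearity; the threshold applied in this project is not stated here.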
Data curators / DDLitLab data literacy project team
Ferdinand Hölzl
Leidy Gicela Vergara Lopez
Shivanshi Asthana
Shuyue Qu
Sojung Oh
Juan Miguel Rodriguez Lopez
References:
https://gitlab.rrz.uni-hamburg.de/incorporating-ml-models-for-spatial-demand-prediction-of-weekend-markets-in-the-city-of-hamburg-germany/incorporating-ml-models-for-spatial-demand-prediction-of-weekend-markets-in-the-city-of-hamburg-germany
https://www.cliccs.uni-hamburg.de/about-cliccs/news/2023-news/2023-08-24-ddlitlab-event.html