April 2025 – pradyumnastats-522

April 15, 2025 | May 5, 2025

Week-11

I finalized the findings and evaluated temporal trends across the clusters. There was an increase in protests over recent years, and a decline in average fatalities per event. These insights added valuable depth to our analysis. Wrapping up the model and contributing to the report gave me a well-rounded experience in both technical and interpretive aspects of data science. Project 2 is now complete and ready for submission!
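
As a rough illustration of this kind of trend check, the sketch below counts protests per year and computes average fatalities per event. The records are made-up placeholders, not ACLED values, and the tuple layout `(year, event_type, fatalities)` is an assumption for the example.

```python
from collections import defaultdict

# Hypothetical records: (year, event_type, fatalities). Not real ACLED data.
events = [
    (2021, "Protests", 0), (2021, "Battles", 4),
    (2022, "Protests", 0), (2022, "Protests", 0), (2022, "Riots", 3),
    (2023, "Protests", 0), (2023, "Protests", 0), (2023, "Protests", 0),
    (2023, "Battles", 2),
]

protests_per_year = defaultdict(int)
fatalities_per_year = defaultdict(int)
events_per_year = defaultdict(int)

for year, event_type, fatalities in events:
    events_per_year[year] += 1
    fatalities_per_year[year] += fatalities
    if event_type == "Protests":
        protests_per_year[year] += 1

# Average fatalities per event, by year.
avg_fatalities = {
    year: fatalities_per_year[year] / events_per_year[year]
    for year in sorted(events_per_year)
}

for year in sorted(events_per_year):
    print(year, protests_per_year[year], avg_fatalities[year])
```

The same loop could be keyed by `(cluster, year)` instead of `year` to compare trends across clusters.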

April 8, 2025 | May 5, 2025

Week-10

After cleaning and pre-processing, I implemented K-means clustering on the event locations. Using the Geopy library for geodesic distance and running the elbow method, we chose 4 as the optimal number of clusters. The results clearly grouped the country into different conflict zones based on event types. This week helped deepen my understanding of unsupervised learning and clustering concepts.
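
For readers new to the elbow method, here is a minimal standard-library sketch. It is not the project code: it runs a plain Lloyd's-algorithm K-means with Euclidean distance on synthetic (latitude, longitude) points, rather than Geopy's geodesic distance, and the four blob centers are hypothetical. The idea is to compute the within-cluster sum of squares (inertia) for several values of k and look for the point where the curve flattens.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two (lat, lon) pairs."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
        for p in points
    ]

def kmeans_inertia(points, k, rng, iters=100):
    """Run Lloyd's algorithm once and return the final inertia."""
    centroids = rng.sample(points, k)  # random initial centroids
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append((sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))

# Synthetic stand-in for event locations: four blobs around made-up centers.
rng = random.Random(42)
centers = [(30.0, -95.0), (34.0, -118.0), (41.0, -74.0), (47.0, -122.0)]
points = [(rng.gauss(lat, 0.5), rng.gauss(lon, 0.5))
          for lat, lon in centers for _ in range(40)]

# Best of 5 random restarts for each k; the "elbow" is where the drop levels off.
inertias = {k: min(kmeans_inertia(points, k, rng) for _ in range(5))
            for k in range(1, 8)}

for k, value in inertias.items():
    print(k, round(value, 1))
```

Plotting `inertias` against k gives the usual elbow curve. Note that Euclidean distance on raw degrees only approximates true ground distance.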

This week’s focus was on interpreting the clusters and understanding what each one represented. Northern regions mostly experienced violence against civilians, while central regions had more protests. I found it interesting how clustering highlighted patterns that might be hidden in raw data. We also visualized the clusters on maps, which made our results easier to communicate to others.
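
One simple way to see what each cluster represents is to tabulate event types per cluster and pick the most frequent one. The sketch below assumes each event already has a cluster label; the cluster ids and records are invented for illustration and do not reproduce our results.

```python
from collections import Counter, defaultdict

# Hypothetical (cluster_id, event_type) pairs. Illustrative only.
labelled_events = [
    (0, "Violence against civilians"), (0, "Violence against civilians"),
    (0, "Protests"),
    (1, "Protests"), (1, "Protests"), (1, "Protests"), (1, "Riots"),
    (2, "Riots"), (2, "Riots"), (2, "Protests"),
]

# Cross-tabulate: one Counter of event types per cluster.
counts = defaultdict(Counter)
for cluster_id, event_type in labelled_events:
    counts[cluster_id][event_type] += 1

# Most common event type in each cluster.
dominant = {cluster_id: counter.most_common(1)[0][0]
            for cluster_id, counter in counts.items()}

for cluster_id in sorted(counts):
    print(cluster_id, dict(counts[cluster_id]), "->", dominant[cluster_id])
```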

April 1, 2025 | May 5, 2025

Week-9

This week I finished tuning and testing the model. I compared multiple models and evaluated them using confusion matrices and ROC curves. We selected the best-performing model and generated predictions for unseen data. It was interesting to note how model performance varied depending on the features included. The results are now consolidated into a clean report with graphs and interpretation. Project 1 is complete!
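
To make the evaluation step concrete, here is a small standard-library sketch of a confusion matrix and ROC AUC for two candidate models. It assumes a binary classification task; the labels, scores, and the 0.5 threshold are all hypothetical and are not our Project 1 results.

```python
def confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels 0/1."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return [[tn, fp], [fn, tp]]

def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical held-out labels and predicted probabilities from two models.
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
scores_a = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.7, 0.3]
scores_b = [0.5, 0.6, 0.4, 0.7, 0.3, 0.45, 0.8, 0.55]

# Threshold the scores at 0.5 to get hard predictions for model A.
pred_a = [1 if s >= 0.5 else 0 for s in scores_a]
cm_a = confusion_matrix(y_true, pred_a)

auc_a = roc_auc(y_true, scores_a)
auc_b = roc_auc(y_true, scores_b)
best = "A" if auc_a >= auc_b else "B"

print(cm_a, auc_a, auc_b, best)
```

AUC compares models across all thresholds, while the confusion matrix shows the error types at one chosen threshold, so the two views complement each other.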

For Project 2, I began analyzing the ACLED US dataset on political violence. I noticed right away that it was much larger and more diverse in terms of event types. The data required geolocation checks, and several columns needed encoding or transformation. I also started thinking about which clustering method would suit our problem, and K-means seemed like a strong candidate due to the spatial nature of the data.
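
A minimal sketch of what a geolocation check and a simple encoding step can look like, using only the standard library. The field names `event_type`, `latitude`, and `longitude` and the sample rows are assumptions for illustration, not the actual ACLED schema or data.

```python
# Hypothetical rows; field names and values are placeholders.
rows = [
    {"event_type": "Protests", "latitude": 38.9, "longitude": -77.0},
    {"event_type": "Strategic developments", "latitude": 95.0, "longitude": -120.0},
    {"event_type": "Protests", "latitude": None, "longitude": -87.6},
    {"event_type": "Riots", "latitude": 34.05, "longitude": -118.24},
]

def has_valid_coords(row):
    """True when both coordinates are numbers inside the valid ranges."""
    lat, lon = row["latitude"], row["longitude"]
    if not isinstance(lat, (int, float)) or not isinstance(lon, (int, float)):
        return False
    return -90 <= lat <= 90 and -180 <= lon <= 180

# Keep only rows that pass the coordinate check.
clean_rows = [row for row in rows if has_valid_coords(row)]

# Label-encode the categorical column: one integer code per event type.
categories = sorted({row["event_type"] for row in clean_rows})
code_of = {name: code for code, name in enumerate(categories)}
encoded = [code_of[row["event_type"]] for row in clean_rows]

print(len(rows), "->", len(clean_rows), code_of, encoded)
```

Integer codes imply an ordering that event types do not have, so one-hot encoding may be a better fit for models that are sensitive to that.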