Top Technical Skills: Python, R, SQL, Cypher (Neo4j)

Greetings! I’m Maria Aroca — a data scientist, researcher, and purpose-driven builder of systems that serve people. With a PhD in Political Science and a Master’s in Data Science, I lead CIDEACC — the Center for Development, Innovation, and Artificial Intelligence at Clínica de La Costa — where we use data science and AI to solve real-world challenges in healthcare. I’m especially passionate about mentoring local talent and creating opportunities for others to grow alongside the technology we’re developing.

About me

Areas of Interest

Education

  • M.S., Data Science - IU-Bloomington (Dec 2024)

  • Ph.D., Political Science - Rice University (May 2022)

  • M.A., Political Science - Universidad de los Andes (Aug 2014)

  • B.A., Political Science - Universidad de los Andes (Feb 2012)

Work Experience

Lead Researcher @ CIDEACC (September 2024 - Present)

  • Leading AI-driven research projects on predictive diagnostics, personalized medicine, and clinical workflow optimization, using machine learning algorithms and data analytics.

  • Collaborating with clinicians and engineers to develop AI solutions for healthcare, including the application of natural language processing and computer vision to assist in medical diagnostics.

Data Scientist @ 20Moves (June 2023 - August 2024)

  • Developed a knowledge graph with over 22 million nodes for 20 Moves, enabling complex analysis of New York's political landscape and supporting strategic decision-making for social movements.

  • Implemented Python pipelines and Cypher queries in Neo4j that integrated structured, unstructured, API, web-scraped, and LLM-retrieved data, improving the organization's data analytics and the usability of its application.

  • Used network analysis techniques to identify key influencers and pathways in political data, making the application more useful for navigating New York's political network.

Data Scientist @ Secretaría de Transparencia - Presidencia de la República de Colombia (November 2020 - December 2021)

  • Designed and implemented ETL pipelines and reporting tools for portal.paco.gov.co that analyze the procurement activities of Colombian government entities in support of government transparency. The production-grade code remains in operation.

  • Developed strategies to integrate machine learning and graph databases to detect potential corruption in government procurement processes.

  • The tool has become a critical resource for journalists, citizens, and government officials, and has been used in high-profile cases to expose irregularities in procurement activities, promoting transparency and accountability.

Research Assistant @ Congreso Visible - Universidad de los Andes (July 2010 - July 2014)

  • Collected, cleaned, and structured extensive datasets on legislative activity for the Congreso Visible transparency initiative, helping legislators, lobbyists, and citizens access current information on the activity and members of the Colombian national legislature.

  • Used statistical analysis to produce detailed reports on legislative activity in the Colombian Congress.