Venue: Z2.10
Friday, May 23
 

1:15pm CEST

🤖 How LLMs can classify thousands of records in minutes
Friday May 23, 2025 1:15pm - 2:30pm CEST
You find yourself staring at a dataset with tens or hundreds of thousands of rows. Maybe you want to get up-to-date FOIA contact details for every government department in your country, or to find out which political donors have links to the fossil fuel industry. What do you do?

Large Language Models (LLMs) can help journalists automate simple research and classification tasks that would take an unreasonably long time to do manually.

In this session, we'll outline how Global Witness has used LLMs, search engines and web scraping to help us identify fossil fuel lobbyists at COP29. These techniques can be applied to other investigations and research tasks.

The workshop will cover:

- An interactive classification demo
- Some basic tips on setting up a research/classification project
- The challenges of doing AI research at scale and how to address them
- Using more advanced tools

After attending this session, you will be able to take an existing dataset and automatically augment it with new data, opening up the potential for new stories and investigations.

If you want to follow along with the classification demo, you'll need to be able to run Jupyter Notebooks on your device or have a Google account. A basic understanding of Python would be useful, but we won't be writing any new code.
Z2.10

3:00pm CEST

Cracking the code: how to use RegEx in your investigations
Friday May 23, 2025 3:00pm - 4:15pm CEST
When you unlock the power of regular expressions (RegEx) you supercharge your spreadsheet!

Participants will learn how to extract hidden patterns from text, clean messy datasets, and automate repetitive tasks using the RegEx formulas in Google Sheets. The session is also a good introduction if you want to apply RegEx in code.

Through practical examples, such as extracting donations, cleaning salutation-heavy lists, and extracting postcodes, you will leave with the confidence to apply RegEx in your day-to-day data work.
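
For readers who want a feel for these three tasks outside a spreadsheet, here is a minimal Python sketch. It is not the session's material: the sample rows are hypothetical, the patterns are deliberately simplified, and the postcode pattern assumes UK-style postcodes, which the session description does not specify.

```python
import re

# Hypothetical sample rows; not taken from the session's demo spreadsheet.
rows = [
    "Mr. John Smith donated £1,250.00, London SW1A 1AA",
    "Dr Jane Doe donated £300, Manchester M1 1AE",
]

# Donation amount: a pound sign, then digits with optional comma separators
# and optional pence.
AMOUNT = re.compile(r"£(\d[\d,]*(?:\.\d{2})?)")
# Simplified UK-style postcode (an assumption); real postcodes have more
# edge cases than this pattern covers.
POSTCODE = re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}\b")
# Leading salutation such as "Mr", "Mrs.", "Ms" or "Dr".
SALUTATION = re.compile(r"^(?:Mr|Mrs|Ms|Dr)\.?\s+")


def parse_row(text):
    """Strip the salutation and pull out the amount and postcode, if present."""
    amount = AMOUNT.search(text)
    postcode = POSTCODE.search(text)
    return {
        "cleaned": SALUTATION.sub("", text),
        "amount": float(amount.group(1).replace(",", "")) if amount else None,
        "postcode": postcode.group(0) if postcode else None,
    }


parsed = [parse_row(row) for row in rows]
for record in parsed:
    print(record)
```

Each pattern handles one job (match, extract, or replace), which is the same split a spreadsheet workflow would use, one formula per job.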

Attendees will receive a RegEx cheat sheet (customisable for their own use) and a practical demo spreadsheet to take their skills to the next level. No prior experience of RegEx is required, but you should be comfortable writing formulas in Google Sheets/Excel. I will share a Google Sheet containing the data, so you will need a Google account to access it.
Z2.10

4:45pm CEST

Investigating built expansion on protected natural areas 🌳: learn the basics of PostGIS
Friday May 23, 2025 4:45pm - 6:00pm CEST
Working with geospatial datasets is often critical to investigations dealing with where things are located, where an event occurred, land ownership, and the environment.

In this workshop, participants will learn the basics of using PostGIS to query and join geospatial datasets, with the Arena+ "Europe in Grey" investigation as a case study.

We’ll run through a brief overview of PostGIS, the spatial data types, indexes, and functions it adds to Postgres, and a few of its strengths and weaknesses as a tool.

Participants will connect to an existing database containing a dataset of built expansion in Europe and a few datasets used in the Arena investigation. They’ll be guided through writing queries to reproduce some of the investigation’s results, answering questions like:

- Which European country has lost the greatest proportion of its wild areas since 2018?
- Which protected areas have had the most building on them?

To take part in this workshop, participants should feel comfortable writing basic SQL queries.
Participants should bring their laptops with either DBeaver or their favorite SQL client tool installed.
Z2.10