Venue: Z2.10
Saturday, May 24
 

9:30am CEST

Teaching LLMs to build your Machine Learning Models
Saturday May 24, 2025 9:30am - 10:45am CEST
In this practical session, participants will learn how LLMs like ChatGPT can assist in writing machine learning code for journalistic investigations.

We’ll start by prompting ChatGPT to generate code for analyzing a small dataset. Then, we’ll apply the code to a larger dataset locally. After attending this session, participants will be able to train and use this machine learning model on their own.

This method was used by Frontstory.pl to analyze thousands of messages on Telegram and reveal the scale of drug trafficking activity in Poland.

To follow along, participants should be comfortable using Python and Jupyter Notebook.
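
As a rough illustration of the kind of code an LLM might be prompted to produce for this workflow, the sketch below trains a small text classifier on a hand-labelled sample and then applies it to unseen messages. It is not the code used in the session or by Frontstory.pl: the naive Bayes approach, the function names, the labels and the toy messages are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(samples):
    """Fit a multinomial naive Bayes model on (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    label_counts = Counter()            # label -> number of training texts
    vocabulary = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return word_counts, label_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total_texts = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_texts in label_counts.items():
        score = math.log(n_texts / total_texts)
        n_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Laplace smoothing keeps unseen words from zeroing the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (n_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical hand-labelled sample; a real project would use far more data.
small_sample = [
    ("fast delivery best price order now", "advert"),
    ("new stock available order today", "advert"),
    ("see you at the meeting tomorrow", "other"),
    ("thanks for the photos from the trip", "other"),
]
model = train(small_sample)

# Apply the trained model to messages it has not seen.
new_messages = ["order now best stock", "photos from the meeting"]
predictions = [predict(model, message) for message in new_messages]
print(predictions)
```

In practice the larger dataset would be read from a file and the predictions spot-checked by hand before any figure is reported.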
Z2.10

11:15am CEST

More than just the Wayback Machine: how to investigate deleted and archived content
Saturday May 24, 2025 11:15am - 12:30pm CEST
Even among investigative journalists, web archives tend to be underrated – and undertaught. This hands-on session introduces journalists to powerful techniques for using web archives.
Participants will learn how to recover deleted or hidden content and archive key material from platforms like Instagram and X.
Using real-world examples, we’ll demonstrate how these skills can strengthen reporting across a wide range of stories, from everyday reporting to investigative longreads.
After this session, you will be able to retrieve archived content, recover deleted posts (not necessarily the same things!), and preserve online material using advanced web archiving tools and techniques. We will teach participants how to tweak the URL and use the asterisk, and we will demonstrate why the "Golden Hour" of archiving is so important in breaking news situations.
No prior experience is required—just an interest in digital sleuthing and a willingness to explore new tools.
Please bring a laptop with you, preferably with the Chrome browser installed.
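
For readers who want a head start on URL tweaking and the asterisk, the sketch below builds a few commonly used Wayback Machine address patterns. It may not match the exact techniques taught in the session. The function names and the example.com addresses are placeholders chosen for this example.

```python
WAYBACK = "https://web.archive.org"


def capture_calendar(url):
    """URL of the calendar page listing every capture of one address."""
    return f"{WAYBACK}/web/*/{url}"


def captured_under(prefix):
    """URL of the listing of all archived addresses that start with a prefix."""
    return f"{WAYBACK}/web/*/{prefix.rstrip('/')}/*"


def closest_capture(url, timestamp):
    """URL that redirects to the capture nearest a YYYYMMDDhhmmss timestamp."""
    return f"{WAYBACK}/web/{timestamp}/{url}"


def save_page(url):
    """Save Page Now address that requests a fresh capture of a live page."""
    return f"{WAYBACK}/save/{url}"


print(captured_under("example.com/news/"))
print(closest_capture("example.com", "20250524093000"))
```

The trailing asterisk in `captured_under` is what turns a single-page lookup into a listing of everything archived below that path, which can help when the exact address of a deleted page is unknown.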
Z2.10

1:45pm CEST

Together at last: R and Python united in the Positron IDE
Saturday May 24, 2025 1:45pm - 3:00pm CEST
For years, data journalists have been forced to choose between R and Python in order to do data analysis with a scripting language. This meant the choice of IDE (integrated development environment – the app for writing and managing scripts and files) was always a defining decision.

R users mostly turned to RStudio to maintain and run scripts, make plots and so on. Python users have had a variety of options, such as Google Colab, Jupyter and Anaconda, to manage their scripts and projects.

Now there’s a program built to handle both languages in parallel (but not quite simultaneously!): it’s called Positron.

In this session we will introduce you to Positron. We will show you the interface and how to get started with your usual coding language, before working through some scenarios where being able to move quickly from one language to the other is desirable. (If you have examples of times when you’ve needed this, please bring them to the session.)

You will ideally have some experience of R or Python, and some appetite for using the other language, perhaps even on deadline. If you want to follow along in the session, install Positron beforehand from https://positron.posit.co/
Speakers
Jonathan Stoneman

Arena for Journalism in Europe
Z2.10

3:30pm CEST

💡 Streamlit for building tools and collaborating with non-coders
Saturday May 24, 2025 3:30pm - 4:45pm CEST
With Streamlit, you can set up a web page in just a few lines of Python code to share your findings with your team or your audience – or to collect information from them. Use it to swiftly try out an idea for publication before asking your IT department to develop it, or to let a colleague make use of a Python-scripted tool you've written. Or build yourself a chatbot to help navigate your own research, local and safe on your computer.
In this session, we’ll cover the basics of Streamlit and build a page where users can upload a PDF along with some information, send it to a Python function for processing, and display the results. More advanced users will learn how to build an LLM-powered chatbot.
Streamlit is a Python library, so you should have a basic understanding of Python. You also need to be the admin of your computer, or at least have permission to start a local web server on it. If you want to build a chatbot, you’ll need to install Ollama and download a model such as Gemma3 (ollama.com/library/gemma3) before the session starts.
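
To give a feel for the page described above, here is a minimal sketch of a Streamlit app that accepts a PDF and a short note, passes both to a Python function, and shows the result. It is not the session's own material. The `describe_pdf` helper and its output fields are hypothetical stand-ins for whatever processing you actually need, and the `try`/`except` around the import is only there so the helper can be run without Streamlit installed.

```python
try:
    import streamlit as st  # third-party: pip install streamlit
except ImportError:  # lets the helper below run without Streamlit
    st = None


def describe_pdf(data, note):
    """Hypothetical processing step: report basic facts about the upload."""
    return {
        "note": note.strip(),
        "size_bytes": len(data),
        "looks_like_pdf": data.startswith(b"%PDF-"),
    }


if st is not None:
    st.title("PDF intake")
    uploaded = st.file_uploader("Upload a PDF", type="pdf")
    note = st.text_input("What is this document?")
    # Only process once a file is present and the user clicks the button.
    if uploaded is not None and st.button("Process"):
        st.json(describe_pdf(uploaded.getvalue(), note))
```

Saved under a name such as `app.py` (an assumed filename), the page can be started with `streamlit run app.py`, which launches the local web server mentioned above.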
Z2.10

5:15pm CEST

Scraping the unscrapable: advanced approaches to deal with complex sites and evade anti-scraping systems
Saturday May 24, 2025 5:15pm - 6:30pm CEST
Scraped data can often be the backbone of an investigation, but some websites are more difficult to scrape than others. This session will cover best practices for dealing with tricky sites, including coping with captchas, IP blocks, and browser fingerprinting.

This is an advanced session aimed at people who already have experience of writing code to scrape websites and want to move up to the next level: participants will leave with an understanding of how to approach hard-to-scrape websites, plus the tradeoffs and costs of these approaches.
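
One of the gentler tradeoffs in this area is simply slowing down when a site signals rate limiting. The sketch below retries with exponential backoff and jitter using only the standard library. It is an illustrative assumption, not the session's method, and it does nothing about captchas or browser fingerprinting. The function names, defaults and the choice of status codes 429 and 503 are made up for this example.

```python
import random
import time
import urllib.error
import urllib.request


def backoff_delays(attempts, base=1.0, cap=60.0):
    """Exponentially growing waits in seconds, capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


def fetch(url, attempts=5, opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch a URL, waiting longer after each 429 or 503 response."""
    for delay in backoff_delays(attempts):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as error:
            if error.code not in (429, 503):
                raise  # other errors are not rate limiting; do not retry
            # Random jitter avoids retrying in lockstep with other workers.
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError(f"gave up on {url} after {attempts} attempts")
```

The `opener` and `sleep` parameters exist so the retry logic can be exercised with stand-ins instead of real network calls and real waiting.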
Z2.10
 