David Pilato
🇫🇷 - everywhere 🇧🇪 - everywhere 🇬🇧 - everywhere 🇳🇱 - everywhere

Developer | Evangelist

Talks

Search: a new era
🏳️ EN 🏳️ FR #elasticsearch #vector-search #machine-learning

Search is not just traditional TF/IDF any more but the current trend of machine learning and models has opened another dimension for search.

This talk gives an overview of:

  • β€œClassic” search and its limitations
  • What is a model and how can you use it
  • How to use vector search or hybrid search in Elasticsearch
  • Where OpenAI’s ChatGPT or similar LLMs come into play to with Elastic

The main demo covers how to generate embeddings from a music and then use the techniques we learned to propose the most probable version of it when we hum a song 🎢🎸🎻.

And the beats go on !
🏳️ EN 🏳️ FR #elasticsearch #beats

https://youtu.be/fOaxEa5ONJw

Discover the new data shippers for Elasticsearch:

  • Packetbeat: sniff network protocols
  • Topbeat: collect metrics
  • Filebeat: analyze logs in real time or send them to Logstash for enrichment

Then learn how to contribute to the mix and add your own beats with Libbeat.

Make sense of your (BIG) data!
🏳️ EN 🏳️ FR #elasticsearch #kibana

Elasticsearch: you know, for search! But you can use it as well to compute information on live data.

During this session, you will discover how elasticsearch actually works behind the scene and why this great open source software is really more than “search”. We will inject marketing data into elasticsearch and build live a dashboard using Kibana, generic and powerful visualisation tool under Apache2 license. In minutes, you will know how to build YOUR own dashboard and make sense of YOUR data!

Managing your black Friday logs
🏳️ EN 🏳️ FR #elasticsearch
TODO: add abstract
Node discovery in Cloud environment
🏳️ EN 🏳️ FR #elasticsearch
TODO: add abstract
The Art of DeeJaying
🏳️ EN 🏳️ FR #music

In this session, we’ll discover some techniques used by DJs to keep you dancing all night long:

  • BPM (Beats Per Minute) adjustment
  • Harmonic adjustment
  • Frequency equalizers
  • Cue Points
  • Loops

At the end, we’ll try to do a multi-handed mix to put into practice what we’ve learned.

Main equipment used:

Hands on elasticsearch / Kibana
🏳️ EN 🏳️ FR #elasticsearch #kibana

Let’s play with elasticsearch and Kibana

We will install elasticsearch, Kibana and Marvel and will use that tools to:

  • index/update/get/delete documents
  • search
  • compute
  • build dashboards to make sense of marketing data
  • snapshot your data and restore them

Intended audience

Audience should already know some basics like JSON and need to have a laptop with:

  • a JVM (1.7 preferred)
  • a web browser
Elasticsearch
🏳️ EN 🏳️ FR 🏳️ FR (with AI) #elasticsearch

Are you still using SQL queries for searches? Are your users frustrated about not being able to search across all categories? Is your average response time over half a second with just a few million documents? Does it take you three days to generate statistics on your data? Are you looking to provide a Google-like search experience for your information system?

If so, this conference is for you.

During the session, David will explain the transition from SQL searches to Elasticsearch, highlighting the benefits of this engine compared to a pure Lucene solution. Topics will include:

  • Why use a search engine?
  • Indexing
  • Searching
  • Aggregations and the concept of faceted navigation
  • Analysis and mapping (if time allows)
  • The community

Depending on interest, we can also touch on recent developments, such as vector search and the ES|QL language/engine in a simplified manner.

Elasticsearch Query Language: ES|QL
🏳️ EN 🏳️ EN - Slideless 🏳️ FR 🏳️ FR - Slideless #elasticsearch #esql

Elasticsearch and Kibana added a brand new query language: ES|QL β€” coming with a new endpoint (_query) and a simplified syntax. It lets you refine your results one step at a time and adds new features like data enrichment and processing right in your query. And you can use it across the Elastic Stack β€” from the Elasticsearch API to Discover and Alerting in Kibana. But the biggest change is behind the scenes: Using a new compute engine that was built with performance in mind.

Join us for an overview and a look at syntax and internals.
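
As a rough illustration of the step-by-step syntax and the _query endpoint mentioned above, here is a minimal Java sketch that assembles an ES|QL pipeline and wraps it in an HTTP request without sending it. The index name `logs`, the fields `status` and `host`, and the `http://localhost:9200` address are assumptions made up for this example.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;

// Sketch only: builds (but never sends) a request for the ES|QL _query endpoint.
final class EsqlRequestSketch {

    // Each ES|QL stage refines the output of the previous one; stages are joined with pipes.
    static String pipeline(List<String> stages) {
        return String.join(" | ", stages);
    }

    // Wraps the query in a JSON body with a "query" field, escaping backslashes and double quotes.
    static String jsonBody(String query) {
        String escaped = query.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"query\": \"" + escaped + "\"}";
    }

    // Prepares a POST to <baseUrl>/_query; baseUrl is an assumption (a local, unsecured node).
    static HttpRequest buildRequest(String baseUrl, String query) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_query"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody(query)))
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical index and field names, chosen only for illustration.
        String query = pipeline(List.of(
                "FROM logs",
                "WHERE status >= 500",
                "STATS errors = COUNT(*) BY host",
                "SORT errors DESC",
                "LIMIT 5"));
        HttpRequest request = buildRequest("http://localhost:9200", query);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(jsonBody(query));
    }
}
```

Reading the stages top to bottom mirrors the "one step at a time" refinement: pick a source, filter it, aggregate, sort, then cap the output.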

Indexing your office documents with Elastic and FSCrawler
🏳️ EN 🏳️ FR #elasticsearch #fscrawler

You have plenty of Open Office, Microsoft Office, PDF, image documents and you may want to be able to search for their metadata and content. How can you do that?

In this talk, David will explain how Apache Tika can be used for that and how to combine this fantastic library with Elastic Stack:

Ingest node: (re)index and enrich documents in Elasticsearch
🏳️ EN 🏳️ FR #elasticsearch

When you ingest data into Elasticsearch, you may need to perform fairly simple transformation operations. Until now, these operations had to be done outside of Elasticsearch, before the actual indexing.

Welcome Ingest node! A new type of node that allows you to do just that.

This talk explains the concept of Ingest Node, how to integrate it with the rest of the Elastic software suite, and how to develop your own Ingest plugin in practice by showing how I developed the ingest-bano plugin to enrich French postal addresses and/or geographic coordinates (for now).

This talk will also cover the reindex API which can also benefit from the ingest pipeline to modify your data on the fly during reindexing.

Advanced (elastic)search for your legacy application
🏳️ EN 🏳️ FR #elasticsearch #spring-boot #java

How do you mix SQL and NoSQL worlds without starting a messy revolution?

This live coding talk will show you how to add Elasticsearch to your legacy application without changing all your current development habits. Your application will have suddenly have advanced search features, all without the need to write complex SQL code!

David will start from a Spring Boot/MySQL based application and will add a complete integration of Elasticsearch, all live from the stage during his presentation.

Monitor Your Java Applications with the Elastic Stack: Logs, Metrics, Pings, and Traces
🏳️ EN 🏳️ FR #elasticsearch #kibana #apm #java

This talk gives an overview on how to monitor distributed applications. We dive into:

  • System metrics: Keep track of network traffic and system load.
  • Application logs: Collect structured logs in a central location.
  • Uptime monitoring: Ping services and actively monitor their availability and response time.
  • Application metrics: Get the information from the application’s metrics and health endpoints via REST or JMX.
  • Request tracing: Trace requests through a distributed system and show how long each call takes and where errors are happening.

And we will do all of that live, since it is so easy and much more interactive that way.

🎹🎻🎸 Searching for similar music tracks 🎼🎢
🏳️ EN 🏳️ FR #elasticsearch #vector-search

In this session, we will use the principles of vector search to find pieces of music 🎢 that are (maybe) similar to others. To do this, we will review the principles of generating embeddings to represent any type of data, whether textual or binary.

Join us for an overview and a look at syntax and internals.
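
To make the idea concrete, here is a minimal sketch of similarity search over embeddings, assuming each track has already been turned into a vector. The three-dimensional vectors and track names are invented for illustration, and the brute-force cosine comparison is a simple stand-in for the approximate nearest-neighbour search a search engine would typically run.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: brute-force nearest neighbour by cosine similarity over toy embeddings.
final class SimilarTracksSketch {

    // Cosine similarity: dot product divided by the product of the vector lengths.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the name of the track whose embedding is closest to the query embedding.
    static String mostSimilar(double[] query, Map<String, double[]> tracks) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, double[]> entry : tracks.entrySet()) {
            double score = cosine(query, entry.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        return best;
    }

    // Hypothetical embeddings; a real model would produce hundreds of dimensions.
    static Map<String, double[]> demoTracks() {
        Map<String, double[]> tracks = new LinkedHashMap<>();
        tracks.put("track-a", new double[] {1.0, 0.0, 0.0});
        tracks.put("track-b", new double[] {0.9, 0.1, 0.0});
        tracks.put("track-c", new double[] {0.0, 1.0, 0.0});
        return tracks;
    }

    public static void main(String[] args) {
        double[] hummed = {0.8, 0.2, 0.0}; // made-up embedding of a hummed melody
        System.out.println(mostSimilar(hummed, demoTracks()));
    }
}
```

Cosine similarity compares direction rather than magnitude, so two recordings of the same melody at different volumes can still land close together if the embedding model behaves that way.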

🎹🎻🎸 Searching for similar music tracks 🎼🎢
Enriching postal addresses with Elastic stack
🏳️ EN 🏳️ FR #elasticsearch

Come and learn how you can enrich your existing data with normalized postal addresses with geo location points thanks to open data and BANO project.

Most of the time postal addresses from our customers or users are not very well formatted or defined in our information systems. And it can become a nightmare if you are a call center employee for example and want to find a customer by its address. Imagine as well how a sales service could easily put on a map where are located the customers and where they can open a new shop…

Let’s take a simple example:

{
  "name": "Joe Smith",
  "address": {
    "number": "23",
    "street_name": "r verdiere",
    "city": "rochelle",
    "country": "France"
  }
}

Or the opposite: I do have the coordinates, but I can’t tell which postal address corresponds to them:

{
  "name": "Joe Smith",
  "location": {
    "lat": 46.15735,
    "lon": -1.1551
  }
}

In this live coding session, I will show you how to solve all those questions using the Elastic stack.

Want to boost your career? Open source yourself!
🏳️ EN 🏳️ FR #open-source
Come discover through a true story how a simple little answer in a discussion forum on an unknown project can completely change and accelerate your career.
Randomized testing: Gotta Catch 'Em All
🏳️ EN 🏳️ FR #java #testing

Chance does things well.

If we apply this idea to unit tests or integration tests, we can make our tests much more unpredictable β€” and as a result, uncover issues that our minds would never have dared to imagine! For example, I recently discovered a bug in a configuration management library that occurs when the Locale is set to AZ. πŸ€¦πŸΌβ€β™‚οΈ

Another, even simpler, example:

int input = generateInteger(Integer.MIN_VALUE, Integer.MAX_VALUE);
int output = Math.abs(input);

This can generate -2147483648… which is quite unexpected for an absolute value! 😉
Randomized tests can uncover these twisted edge cases… That’s what the Elasticsearch team has been doing for years using the RandomizedTesting framework to test all their Java code.
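
The snippet above leaves `generateInteger` undefined, so here is a self-contained sketch of the same idea using only `java.util.Random`. The class name, the bias toward boundary values, and the seed handling are assumptions for illustration, not the RandomizedTesting framework's API; the point is that a seeded generator makes a surprising failure reproducible.

```java
import java.util.Random;

// Sketch only: a seeded, randomized check of the property "Math.abs(x) is never negative".
final class AbsPropertySketch {

    // Hypothetical stand-in for generateInteger: mostly uniform, sometimes a boundary value.
    static int generateInteger(Random random) {
        int pick = random.nextInt(10);
        if (pick == 0) return Integer.MIN_VALUE;
        if (pick == 1) return Integer.MAX_VALUE;
        return random.nextInt();
    }

    // The property under test. It fails only for Integer.MIN_VALUE, whose positive
    // counterpart does not fit in an int.
    static boolean absIsNonNegative(int input) {
        return Math.abs(input) >= 0;
    }

    // Returns the first generated input that breaks the property, or null if none was found.
    static Integer findCounterexample(long seed, int iterations) {
        Random random = new Random(seed);
        for (int i = 0; i < iterations; i++) {
            int input = generateInteger(random);
            if (!absIsNonNegative(input)) return input;
        }
        return null;
    }

    public static void main(String[] args) {
        long seed = System.nanoTime(); // print the seed so a failing run can be replayed
        System.out.println("seed=" + seed + " counterexample=" + findCounterexample(seed, 1_000));
    }
}
```

Replaying a printed seed reproduces the same sequence of inputs, which is what turns a random failure into a debuggable one.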

Add to that real integration tests using TestContainers, and you’ll have a complete approach to tests that regularly fail!

After this talk, you’ll never look at the random() function the same way again, and you’ll discover how (bad) luck can actually help you! 🍀

A NoSQL search engine for searching^H^H^H^H^H^H^H^H finding...
🏳️ EN 🏳️ FR #elasticsearch

You are still searching in your data with SELECT * FROM person WHERE name like '%david%pilato%" ?

Beyond the performance gains, are you sure you are returning the most relevant results for your users first?

Discover how a search engine will help you answer the questions asked by your users, in a relevant and efficient way, while providing features for analyzing the results and this, whatever the volume…

do MORE with stateLESS Elasticsearch
🏳️ EN 🏳️ FR #elasticsearch

How would you create Elasticsearch if you were starting this project in 2025?

  • Decouple compute from storage
  • Externalize persistence and replication management to a blob store like S3, Google Cloud Storage or Azure Blob Storage
  • Dynamically add or remove instances
  • Have the right default values
  • And a super clear and smooth path for developers

This is exactly what we did with Elastic Serverless.

In this session, you will discover how we redesigned Elasticsearch to do more with a Stateless architecture that can run queries on cold storage.

Identify threats with Elastic SIEM
🏳️ EN 🏳️ FR #elasticsearch #security

Knowing what is going on in your environment is an important part of staying on top of security issues. But how do you capture relevant metrics and visualize them? One widely-used tool for that job is the Elastic Stack, formerly known as the ELK stack. This talk shows how to ingest relevant metrics from your network and hosts as well as how to easily visualize them to find suspicious patterns and behaviors. We will be also using the latest tool named SIEM.

We will use real-world honeypot data for this example:

  • The first step is to parse and enrich the data, so we can identify actual attacks, their origin, and more.
  • Then we store and explore the data to find meaningful insights.
  • Which leads us to visualize specific attributes β€” like the location of an attacker or patterns in the attacks.
  • Building upon this we can combine visualizations into dashboards, giving a broader overview.
  • Finally we will use the Kibana SIEM app to see how everything is now getting easy to track for attacks.

Everything done live.

Elastify your app: from SQL to NoSQL
🏳️ EN 🏳️ FR #elasticsearch #nosql

During this “live coding” talk, Tugdual and David will move an old-fashion full SQL application to the NoSQL world. Using CouchBase and Elasticsearch, they will show all gains you can have with this new architecture:

  • Easyness
  • Elasticity (scalablity)

Following points will be covered:

  • Document Oriented Model
  • JSon
  • REST
  • Caching / Memcache
  • Full text search
  • Building live dashboards with Kibana
Testcontainers for real integration tests with Elasticsearch
🏳️ EN 🏳️ FR #elasticsearch #java #testing

How are you testing with your database?

  • Mocking is not an option since you want to test the actual system.
  • In-memory databases, like H2 or HSQLDB, have subtle differences and not all datastores have in-memory cousins.
  • Managing and running tests in parallel against the actual datastore is a pain.

So what is the solution? There are some very neat solutions based on containers, namely the Docker-Maven-Plugin and Testcontainers. From your tests you can start a lightweight, throwaway instance of your datastore and this talk will walk you through how to do that.

And we will introduce the module we built for Elasticsearch: https://www.testcontainers.org/modules/elasticsearch/.

Don't Panic: The Lazy Speaker's Guide to CFP Triage
🏳️ EN 🏳️ FR #ai #automation

Every new CFP triggers the same ritual:

  • Open the conference page
  • Copy the dates
  • Check for duplicates in GitHub
  • Find the right labels
  • Write the issue
  • Add a comment with talk suggestions

⏳ 42 minutes for 3 CFPs! But actually, 20 person-days per year across our team of 13 DevRels.

For. Copy-pasting.

My CFO nearly cried. 😭

So I looked around. Elastic already gave me Elasticsearch for semantic search, a native MCP server in Kibana, a conversational agent, a workflow engine and JINA Reader to parse the web. GitHub offered its own MCP server and GitHub Actions to handle updates.

In this session, I’ll show you, live and with no slides, how I put all of this together to build an agent that does the work for us. One URL in ➡️ one complete GitHub issue out. With human validation, because I trust AI but not blindly. 👀

Whether you track CFPs or not, the patterns are here. Come for the lazy automation, stay for the architecture. So long, and thanks for all the fish. 🐟
