A Heisenbug caused by Elastic’s refresh mechanism

I was designing a few tests to validate the ingestion pipeline of DataScouting’s MediaScouting Core. The test setup spins up an Elastic container, creates a test index and indexes some data. The tests then run aggregations against the index and, after some post-processing of Elastic’s response, assert the results. The setup was very simple.

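import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.testcontainers.elasticsearch.ElasticsearchContainer;

public abstract class AbstractAggregationTest {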

    public static final String IMAGE =
            "docker.elastic.co/elasticsearch/elasticsearch:7.9.3";
    private static final RestHighLevelClient restHighLevelClient;
    protected static final ElasticsearchContainer ELASTICSEARCH_CONTAINER;

    static {
        ELASTICSEARCH_CONTAINER = new ElasticsearchContainer(IMAGE);
        ELASTICSEARCH_CONTAINER.withReuse(true);
        ELASTICSEARCH_CONTAINER.start();

        final String elasticEndpoint = ELASTICSEARCH_CONTAINER
                .getHttpHostAddress();
        restHighLevelClient = new RestHighLevelClient(RestClient
                .builder(HttpHost.create(elasticEndpoint)));

        setupIndex();
        indexTestData();
    }

    private static void setupIndex() {
        // Creates the index
    }

    private static void indexTestData() {
        // Builds a BulkRequest and indexes the data
    }
}

The problem was that every time I ran the tests, the post-processor reported different numbers, so the assertions failed. Interestingly, adding a breakpoint right after the test data was indexed made the problem disappear, which is the hallmark of a Heisenbug. I checked Elastic’s response, and therein lay the problem: Elastic returned a different response on every run. Not only that, but a count of the documents in the index also reported different numbers. Adding a short wait after indexing solved the problem as well.

static {
    // ... testcontainer setup

    setupIndex();
    indexTestData();
    countDocsInIndex(); // <-- changes every time

    Thread.sleep(1000);
    countDocsInIndex(); // <-- correct number
}
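
A fixed sleep is fragile: it is either longer than necessary or too short on a slow machine. If waiting is the chosen workaround, polling for the expected state is usually more robust. The following is only a minimal sketch; `PollingWait` and `await` are hypothetical names that belong neither to Testcontainers nor to the Elasticsearch client, and in the test above the condition would be something like `countDocsInIndex() == expectedCount`.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration; not part of any library used in this post.
final class PollingWait {

    private PollingWait() {
    }

    /**
     * Re-evaluates the condition every pollInterval until it holds or the
     * timeout elapses. Returns true if the condition held in time, false on
     * timeout or if the waiting thread was interrupted.
     */
    static boolean await(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for "the documents are visible": becomes true after roughly 200 ms
        final long visibleAt = System.nanoTime() + Duration.ofMillis(200).toNanos();
        boolean met = await(() -> System.nanoTime() - visibleAt >= 0,
                Duration.ofSeconds(2), Duration.ofMillis(20));
        System.out.println("condition met: " + met);
    }
}
```

Polling still only hides the cause of the flakiness; it does not explain why the documents are missing in the first place.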

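That’s when I remembered Elastic’s refresh mechanism. Elasticsearch does not make newly indexed documents searchable immediately: they first land in an in-memory buffer and only become visible to searches once a refresh writes them into a new Lucene segment, which by default happens periodically (once per second). You can read more about refresh in Elastic’s guide, Near real-time search. So the solution was simply to force a refresh after indexing the data and before proceeding with the tests.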

static {
    // ... testcontainer setup

    setupIndex();
    indexTestData();

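    // Make sure the indexed documents are searchable before the tests run
    try {
        restHighLevelClient
                .indices()
                .refresh(new RefreshRequest(TEST_INDEX), RequestOptions.DEFAULT);
    } catch (IOException e) {
        // refresh() declares IOException, which a static initializer cannot propagate
        throw new IllegalStateException("Could not refresh " + TEST_INDEX, e);
    }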
}
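
To see why the explicit refresh makes the counts deterministic, here is a toy model of the visibility rule. It is an illustration only and says nothing about how Elasticsearch or Lucene are actually implemented; `ToyIndex` and its methods are made-up names. Documents go into a buffer when indexed and are only counted after a refresh moves them to the searchable set.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of near real-time visibility; not Elasticsearch's implementation.
final class ToyIndex {

    // Indexed but not yet visible to searches
    private final List<String> buffer = new ArrayList<>();
    // Visible to searches and counts
    private final List<String> searchable = new ArrayList<>();

    void index(String doc) {
        buffer.add(doc);
    }

    // Moves everything buffered so far into the searchable set
    void refresh() {
        searchable.addAll(buffer);
        buffer.clear();
    }

    int count() {
        return searchable.size();
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.index("doc-1");
        index.index("doc-2");
        System.out.println(index.count()); // prints 0: nothing refreshed yet
        index.refresh();
        System.out.println(index.count()); // prints 2: both documents are visible
    }
}
```

In the real system the refresh also runs on a timer, so a count taken right after indexing depends on whether that timer has already fired. That is why a breakpoint or a one-second sleep appeared to fix the test.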


DataScouting operates MediaScouting Core, a white labeled delivery platform that aggregates online, broadcast, print and social media content. It has been designed by our team and it is being used by media monitoring companies, media agencies, PR agencies, advertisement companies, government bodies, libraries and archives.
