The second stage of the Data Clustering Contest held by Telegram is now complete. Forty submissions are available for testing at contest.com. The second stage lasted two weeks and had a prize fund of €100,000.

Telegram launched the contest in 2019. The first stage ran from November 18 to December 2, 2019, and the second from May 11 to May 25, 2020.

In the first stage, 112 algorithms were submitted, and the prize fund was €100,000. The authors of the best solutions proceeded to the second stage, with a chance to win a share of another €100,000. Those who had not taken part in the first round were also invited to try their luck in the second.

The main task of the Data Clustering Contest was to create a module that could power a news aggregator, along with an algorithm for sorting and grouping news articles. The judges will select the best solutions after testing them and identifying bugs, and the final versions will be developed further.

The participants needed to create and improve clustering algorithms that identify articles in English and Russian and group them into categories and stories. The algorithm also had to analyze, store, and index incoming articles, and optimize the index for queries. In addition, it had to produce a list of stories on specified topics for a specified period, sorted by importance. Articles in Russian should be relevant to a wide range of readers from Russia, and articles in English should be relevant to a wide range of international readers.
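To make the grouping step more concrete, here is a minimal Python sketch of how articles might be grouped into stories and ranked. It is not taken from any contest entry: the `Story` class, the `group_into_stories` function, the title-overlap (Jaccard) similarity, the 0.4 threshold, and the use of article count as a stand-in for importance are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Story:
    """A group of articles assumed to cover the same event (hypothetical type)."""
    titles: list = field(default_factory=list)
    tokens: set = field(default_factory=set)  # tokens of the first title in the story


def tokenize(title):
    # Lowercase word tokens; in Python 3, \w also matches Cyrillic letters.
    return set(re.findall(r"\w+", title.lower()))


def jaccard(a, b):
    # Share of tokens the two sets have in common (0.0 when both are empty).
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_into_stories(titles, threshold=0.4):
    stories = []
    for title in titles:
        tokens = tokenize(title)
        # Find the existing story whose first title is most similar to this one.
        best = max(stories, key=lambda s: jaccard(tokens, s.tokens), default=None)
        if best is not None and jaccard(tokens, best.tokens) >= threshold:
            best.titles.append(title)
        else:
            stories.append(Story(titles=[title], tokens=tokens))
    # Larger stories first: article count is a crude proxy for importance.
    return sorted(stories, key=lambda s: len(s.titles), reverse=True)
```

Calling `group_into_stories` on a list of headlines returns the stories with the most articles first. A real entry would likely rely on richer signals, such as full text, publication time, and source, rather than title overlap alone.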

Contestants submitted their entries as standalone tgnews applications that had to work in two modes: as a command-line interface (CLI) and as an HTTP server.
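The two-mode requirement can be pictured with a small Python skeleton. This is only a sketch under stated assumptions: the program name `tgnews-sketch`, the `stories` and `server` subcommands, the JSON shape, and the `list_stories` placeholder are hypothetical and do not reproduce the actual tgnews command-line interface or HTTP API.

```python
import argparse
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def list_stories():
    # Placeholder result; a real entry would query its article index here.
    return {"stories": []}


class StoriesHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer every GET request with the current story list as JSON.
        body = json.dumps(list_stories()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def build_parser():
    # Hypothetical subcommands, one per mode.
    parser = argparse.ArgumentParser(prog="tgnews-sketch")
    modes = parser.add_subparsers(dest="mode", required=True)
    modes.add_parser("stories", help="print stories once and exit (CLI mode)")
    server = modes.add_parser("server", help="serve stories over HTTP")
    server.add_argument("port", type=int)
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    if args.mode == "stories":
        print(json.dumps(list_stories()))
    else:
        # Blocks until interrupted; binds to localhost only.
        HTTPServer(("127.0.0.1", args.port), StoriesHandler).serve_forever()
```

Calling `main(["stories"])` prints the story list once and returns, while `main(["server", "8080"])` keeps the same logic running behind an HTTP endpoint. Sharing one `list_stories` function between both modes keeps the behavior of the two interfaces identical.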