eaae3e286e
Setting up the different tools for development can be hard for new contributors. This commit adds a Nix flake that provides a reproducible development environment which is easy to set up and ensures consistent code style across all subprojects. The initial flake contains `git`, `pre-commit`, `which` and `bazelisk` wrapped as a `bazel` executable. Dependencies like Rust, Scala, etc. may be added in the `packages` attribute in the future. The current pre-commit configuration is a no-op. Future commits may enable the currently disabled parts of `pre-commit-hooks.nix` for a gradual transition. Supports `(x86_64|aarch64) (Linux|macOS)` systems. |
||
---|---|---|
ann/src/main | ||
ci | ||
cr-mixer | ||
docs | ||
follow-recommendations-service | ||
graph-feature-service | ||
home-mixer | ||
navi | ||
product-mixer | ||
recos-injector | ||
science/search/ingester/config | ||
simclusters-ann | ||
src | ||
timelineranker | ||
timelines/data_processing/ad_hoc/earlybird_ranking/earlybird_ranking | ||
trust_and_safety_models | ||
twml | ||
visibilitylib | ||
.envrc | ||
.gitignore | ||
COPYING | ||
flake.lock | ||
flake.nix | ||
pre-commit-hooks.nix | ||
README.md |
Twitter Recommendation Algorithm
The Twitter Recommendation Algorithm is a set of services and jobs that are responsible for constructing and serving the Home Timeline. For an introduction to how the algorithm works, please refer to our engineering blog. The diagram below illustrates how major services and jobs interconnect.
These are the main components of the Recommendation Algorithm included in this repository:
Type | Component | Description |
---|---|---|
Feature | SimClusters | Community detection and sparse embeddings into those communities. |
Feature | TwHIN | Dense knowledge graph embeddings for Users and Tweets. |
Feature | trust-and-safety-models | Models for detecting NSFW or abusive content. |
Feature | real-graph | Model to predict the likelihood of a Twitter User interacting with another User. |
Feature | tweepcred | Page-Rank algorithm for calculating Twitter User reputation. |
Feature | recos-injector | Streaming event processor for building input streams for GraphJet-based services. |
Feature | graph-feature-service | Serves graph features for a directed pair of Users (e.g. how many of User A's following liked Tweets from User B). |
Candidate Source | search-index | Find and rank In-Network Tweets. ~50% of Tweets come from this candidate source. |
Candidate Source | cr-mixer | Coordination layer for fetching Out-of-Network tweet candidates from underlying compute services. |
Candidate Source | user-tweet-entity-graph (UTEG) | Maintains an in-memory User-to-Tweet interaction graph, and finds candidates based on traversals of this graph. This is built on the GraphJet framework. Several other GraphJet-based features and candidate sources are located here. |
Candidate Source | follow-recommendation-service (FRS) | Provides Users with recommendations for accounts to follow, and Tweets from those accounts. |
Ranking | light-ranker | Light ranker model used by search index (Earlybird) to rank Tweets. |
Ranking | heavy-ranker | Neural network for ranking candidate tweets. One of the main signals used to select timeline Tweets post candidate sourcing. |
Tweet mixing & filtering | home-mixer | Main service used to construct and serve the Home Timeline. Built on product-mixer. |
Tweet mixing & filtering | visibility-filters | Responsible for filtering Twitter content to support legal compliance, improve product quality, increase user trust, and protect revenue through the use of hard-filtering, visible product treatments, and coarse-grained downranking. |
Tweet mixing & filtering | timelineranker | Legacy service which provides relevance-scored tweets from the Earlybird Search Index and UTEG service. |
Software framework | navi | High-performance machine learning model serving, written in Rust. |
Software framework | product-mixer | Software framework for building feeds of content. |
Software framework | twml | Legacy machine learning framework built on TensorFlow v1. |
We include Bazel BUILD files for most components, but not a top-level BUILD or WORKSPACE file.
Development environment
The unified development environment is still in an experimental state. Contributions are welcome!
You can set up a development environment with various tooling via the nix flake:
- Install the nix package manager and enable flakes.
- Enter the development shell by running `nix develop` in the root directory of the repository.
- Optionally, install direnv which will automatically drop you into the development environment when you `cd` into your local clone.
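The setup steps above can be sketched in shell. This is a minimal sketch, not part of the repository: `enable_flakes` is a hypothetical helper, and it assumes your Nix installation reads per-user settings from `~/.config/nix/nix.conf`. If your installation is configured differently (for example through a system-wide `nix.conf`), adjust the path accordingly.

```shell
#!/usr/bin/env bash
# Sketch only: enable_flakes is a hypothetical helper, not part of this repository.
# Assumes Nix reads per-user settings from ~/.config/nix/nix.conf.

enable_flakes() {
  local conf="${1:-$HOME/.config/nix/nix.conf}"
  mkdir -p "$(dirname "$conf")"
  # Append the setting only if no experimental-features line already mentions flakes.
  if ! grep -qs '^experimental-features.*flakes' "$conf"; then
    echo 'experimental-features = nix-command flakes' >> "$conf"
  fi
}

# Typical session, run from the root of your local clone:
#   enable_flakes   # one-time setup
#   nix develop     # enter the development shell
#   direnv allow    # optional: trust the committed .envrc
```

The helper is idempotent, so running it again does not duplicate the setting. `direnv allow` is only needed once per clone, after which direnv loads the environment whenever you `cd` into the directory.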
Contributing
We invite the community to submit GitHub issues and pull requests for suggestions on improving the recommendation algorithm. We are working on tools to manage these suggestions and sync changes to our internal repository. Any security concerns or issues should be routed to our official bug bounty program through HackerOne. We hope to benefit from the collective intelligence and expertise of the global community in helping us identify issues and suggest improvements, ultimately leading to a better Twitter.
Read our blog on the open source initiative here.