mirror of
https://github.com/twitter/the-algorithm-ml.git
synced 2024-12-22 22:31:48 +01:00
Update README.md
parent 78c3235eee
commit 31a6d5125b
@@ -1,19 +1,24 @@
Twhin in torchrec
# Twhin in torchrec
This project contains code for pretraining dense vector embedding features for Twitter entities. Within Twitter, these embeddings are used for candidate retrieval and as model features in a variety of recommender system models.
This project contains code for pretraining dense vector embedding features for Twitter entities.
Within Twitter, these embeddings are used for candidate retrieval and as model features in a variety of recommender system models.
We obtain entity embeddings based on a variety of graph data within Twitter such as:
"User follows User"
"User favorites Tweet"
"User clicks Advertisement"
* "User follows User"
* "User favorites Tweet"
* "User clicks Advertisement"
While we cannot release the graph data used to train TwHIN embeddings due to privacy restrictions, heavily subsampled, anonymized, open-sourced graph data can be used:
https://huggingface.co/datasets/Twitter/TwitterFollowGraph
https://huggingface.co/datasets/Twitter/TwitterFaveGraph
While we cannot release the graph data used to train TwHIN embeddings due to privacy restrictions, heavily subsampled, anonymized, open-sourced graph data can be used:
* https://huggingface.co/datasets/Twitter/TwitterFollowGraph
* https://huggingface.co/datasets/Twitter/TwitterFaveGraph
The code expects parquet files with three columns: lhs, rel, rhs that refer to the vocab index of the left-hand-side node, relation type, and right-hand-side node of each edge in a graph respectively.
The code expects parquet files with three columns:
* lhs
* rel
* rhs
that refer to the vocab index of the left-hand-side node, relation type, and right-hand-side node of each edge in a graph respectively.
The location of the data must be specified in the configuration yaml files in projects/twhin/configs.
The location of the data must be specified in the configuration yaml files in `projects/twhin/configs`.
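As a rough illustration of the input format described above, the sketch below turns a few made-up edges into rows with `lhs`, `rel`, and `rhs` vocab indices and writes them to a parquet file. The entity names, the helper functions (`index_of`, `encode_edges`, `write_parquet`), and the use of a single shared node vocabulary are all assumptions made for this example; they are not taken from the project, which may index each entity type separately.

```python
from typing import Dict, List, Tuple

# Hypothetical raw edges: (left-hand-side entity, relation name, right-hand-side entity).
RAW_EDGES: List[Tuple[str, str, str]] = [
    ("user_a", "follows", "user_b"),
    ("user_a", "favorites", "tweet_1"),
    ("user_b", "clicks", "ad_7"),
]


def index_of(vocab: Dict[str, int], key: str) -> int:
    """Return the vocab index for key, assigning the next free index if it is new."""
    return vocab.setdefault(key, len(vocab))


def encode_edges(raw_edges: List[Tuple[str, str, str]]):
    """Map string edges to integer rows with the three expected columns.

    Assumption: one shared vocabulary for all nodes and one for relations.
    """
    node_vocab: Dict[str, int] = {}
    rel_vocab: Dict[str, int] = {}
    rows = []
    for lhs, rel, rhs in raw_edges:
        rows.append(
            {
                "lhs": index_of(node_vocab, lhs),
                "rel": index_of(rel_vocab, rel),
                "rhs": index_of(node_vocab, rhs),
            }
        )
    return rows, node_vocab, rel_vocab


def write_parquet(rows: List[Dict[str, int]], path: str) -> None:
    """Write the rows to a parquet file (needs pandas and a parquet engine such as pyarrow)."""
    import pandas as pd

    pd.DataFrame(rows, columns=["lhs", "rel", "rhs"]).to_parquet(path, index=False)
```

Calling `encode_edges(RAW_EDGES)` and passing the resulting rows to `write_parquet(rows, "edges.parquet")` would produce a small file in the three-column layout; the output path is a placeholder and would still need to be referenced from a configuration yaml file in `projects/twhin/configs`.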
Workflow