transformer-architecture

Here are 219 public repositories matching this topic...

jshuadvd / LongRoPE

Implementation of the paper "LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens"

nlp machine-learning natural-language-processing ai deep-learning transformers artificial-intelligence gpt language-model natural-language-inference natural tokenization natural-language-understanding attention-is-all-you-need attention-mechanisms transformer-architecture natural-language-procressing tokenizers llm

Updated Jun 1, 2024
Python

razamehar / IMDB-Sentiment-Analysis-BoW-S2S-Models

Sentiment analysis on the IMDB dataset using Bag of Words models (Unigram, Bigram, Trigram, Bigram with TF-IDF) and Sequence to Sequence models (one-hot vectors, word embeddings, pretrained embeddings like GloVe, and transformers with positional embeddings).

python sentiment-analysis tensorflow word-embeddings bag-of-words glove-embeddings sequence-to-sequence-models imdb-dataset transformer-architecture term-frequency-inverse-document-frequency one-hot-encoded-vectors

Updated May 31, 2024
Jupyter Notebook

Awni00 / abstract_transformer

Project repo for the paper "Disentangling and Integrating Relational and Sensory Information in Transformer Architectures" by Awni Altabaa and John Lafferty.

machine-learning attention relational-learning relational-reasoning transformer-architecture machine-learning-research

Updated May 31, 2024
Jupyter Notebook

songqiang321 / Awesome-AI-Papers

This repository is used to collect papers and code in the field of AI.

Updated May 31, 2024

gustavecortal / transformer

Slides from my NLP course on the transformer architecture

nlp tutorial slides transformers transformer transformer-architecture transformer-models

Updated May 30, 2024

RuochenT / transformer_hybrid

This study investigates the effectiveness of three Transformers (BERT, RoBERTa, XLNet) in handling data sparsity and cold-start problems in recommender systems. We present a Transformer-based hybrid recommender system that predicts missing ratings and extracts semantic embeddings from user reviews to mitigate these issues.

matrix-factorization transformer bert multilabel-classification sentence-embeddings hybrid-recommender-system roberta transformer-architecture xlnet cold-start-problem

Updated May 30, 2024
Jupyter Notebook

zhongkaifu / Seq2SeqSharp

Seq2SeqSharp is a fast, flexible, tensor-based deep neural network framework written in .NET (C#). Its features include automatic differentiation, multiple network types (Transformer, LSTM, BiLSTM, and so on), multi-GPU support, cross-platform support (Windows, Linux, x86, x64, ARM), and multimodal models for text and images.

image translation deep-learning neural-network gpu text machine-translation cuda transformer lstm seq2seq sequence-to-sequence tensor encoder-decoder attention-model transformer-encoder transformer-architecture vision-transformer

Updated May 29, 2024
C#

sushantkumar23 / nano-gpt

Simple character-level Transformer

transformers pytorch attention attention-mechanism rope self-attention multi-head-attention shakespeare-dataset transformer-architecture llm rmsnorm

Updated May 27, 2024
Jupyter Notebook

kyegomez / MultiModalMamba

A novel implementation that fuses ViT with Mamba into a fast, agile, high-performance multimodal model. Powered by Zeta, the simplest AI framework ever.