flickr30k

Here are 16 public repositories matching this topic...

eric-ai-lab / ComCLIP

Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"

causality clip svo slip vision-and-language compositionality flickr8k-dataset image-text-matching flickr30k image-text-retrieval winoground blip2

Updated Aug 18, 2024
Python

awsaf49 / flickr-dataset

Download the Flickr8k and Flickr30k image caption datasets

image flickr dataset clip captioning-images image-text flickr8k flickr30k siglip

Updated Feb 6, 2024

nirajankarki5 / Flickr30k-Image-Caption-Generator-Using-Deep-Learning

A deep learning model that generates descriptions of an image.

machine-learning deep-learning caption-generation flickr30k

Updated Mar 11, 2021
Jupyter Notebook

nssharmaofficial / image-caption-generator

Image captioning model with a ResNet50 encoder and an LSTM decoder

encoder decoder pytorch embeddings lstm image-captioning vocabulary-builder resnet50 image-caption-generator flickr30k

Updated Sep 6, 2024
Python

KimRass / CLIP

PyTorch implementation of 'CLIP' (Radford et al., 2021) from scratch and training it on Flickr8k + Flickr30k

multi-modal clip linear-classification flickr8k zero-shot-classification flickr30k text-image-retrieval

Updated Mar 14, 2024
Python

thisisankit27 / SnapSpeak

Visual Elocution Synthesis

docker tesseract-ocr image-captioning flickr30k

Updated Mar 29, 2024
Python

Delphboy / karpathy-splits

Karpathy split JSON files for image captioning

image-caption mscoco-dataset flickr8k-dataset flickr30k karpathy-split

Updated Apr 4, 2024

Sh-31 / ImgCap

ImgCap is an image captioning model that automatically generates descriptive captions for images. It comes in two versions: a CNN + LSTM model and a CNN + LSTM model with an attention mechanism.

torch lstm beam-search resnet deeplearning imagecaptioning torchtext torchvision flickr30k

Updated Sep 10, 2024
Python

HanCai98 / Flickr30k-Dataset

Preprocess the Flickr30k dataset

data-preprocessing flickr30k

Updated Dec 7, 2021
Python

adas0910 / densecap-flickr30K-entities

Processes data produced by flickr30k_entities for use as regional descriptions for the DenseCap model

python json image-captioning h5 densecap flickr30k regional-description

Updated Nov 11, 2022
Python

spoortimorabad / ImageCaptioningGeneration-Using-Swin-Transformer-and-GRU-attention-Mechansim

Image captioning generation using Swin transformer and GRU attention mechanism

tensorflow captions gru mit-license imagecaptioning swin-transformer flickr30k

Updated Oct 8, 2024
Jupyter Notebook

SaharZargarzadeh / ImageCaptioning-Transformer-EfficientNet

Image captioning model using EfficientNetB0 as encoder and a custom Transformer decoder, trained on the Flickr30k dataset. Demonstrates full model architecture, preprocessing, and BLEU-based evaluation in TensorFlow. Built as an educational resource to explain Transformer architecture step-by-step.

deep-learning tensorflow kaggle transformer attention image-captioning bleu-score vision-language efficientnet flickr30k

Updated Jun 20, 2025
Jupyter Notebook

bkhanal-11 / clip-openai

Implementation of CLIP from OpenAI using pretrained Image and Text Encoders.

vit clip flickr30k all-mpnet-base-v2

Updated Dec 12, 2023
Jupyter Notebook

kumarsantosh04 / image-captioning

Attention-based image captioning

computer-vision lstm image-captioning transfer-learning attention-mechanism encoder-decoder flickr30k

Updated Dec 27, 2024
Python

spoluan / flickr30k-image-captioning

An image captioning project built on the Flickr30k dataset. It aims to develop and showcase algorithms and models that generate descriptive captions for images.

nlp computer-vision deep-learning language-modeling cnn neural-networks image-recognition image-captioning sequence transfer-learning datasets image-analysis attention-mechanism encoder-decoder caption-generation flickr30k image-to-text-generation

Updated May 2, 2023
Jupyter Notebook

NafisSaleh / AI-Accessibility-Extension-For-Visually-Impaired-Users

This project aims to provide real-time visual-to-audio conversion, empowering visually impaired users by describing images through generated captions and synthesized audio. The system employs a Transformer-based image captioning model and integrates with a browser extension for seamless functionality.

nlp text-to-speech deep-learning accessibility tensorflow cnn transformer ngrok image-captioning browser-extension deeplearning nlp-machine-learning accessibility-automation fastapi efficientnet flickr30k

Updated Nov 28, 2024
Jupyter Notebook
