300 million pairs of high-quality image-caption dataset

Licensed off-the-shelf datasets help AI projects get off the ground quickly.

300 million images, each paired with a description. All are genuine works published by photographers. The vast majority of descriptions are in English, with very few in Chinese.

Data specifications

Data size: 300 million images, each paired with a textual description. The complete image library (photographic and vector images) totals nearly 300 million; the full dataset available for generative AI training (curated photographic and vector images, excluding editorial/news images) comprises approximately 100 million.

Data formats: Image formats: .jpg, .png, .svg; Description format: .txt

Data content: Original copyrighted images officially released by their creators, with accompanying descriptions written by the content creators.

Data types: Photographic images and vector illustrations covering diverse scene categories.

Data resolution: 4K and above

Description languages: Predominantly English, with a small Chinese portion.
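
The specification lists the file formats but not how files are laid out on disk. As a minimal sketch, assuming each image ships with a UTF-8 `.txt` caption that has the same base name and sits in the same directory (an assumption, not something the listing confirms), the pairs could be collected like this. The function name `pair_images_with_captions` is hypothetical.

```python
from pathlib import Path

# Image extensions listed in the data specification.
IMAGE_EXTENSIONS = {".jpg", ".png", ".svg"}


def pair_images_with_captions(root):
    """Return (image_path, caption_text) pairs found under `root`.

    Assumption (not confirmed by the source): each image's caption is a
    UTF-8 .txt file with the same base name in the same directory.
    Images without a matching caption file are skipped.
    """
    pairs = []
    for image_path in sorted(Path(root).rglob("*")):
        # Skip directories and anything that is not a listed image format.
        if not image_path.is_file():
            continue
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        caption_path = image_path.with_suffix(".txt")
        if caption_path.is_file():
            caption = caption_path.read_text(encoding="utf-8").strip()
            pairs.append((image_path, caption))
    return pairs
```

If the delivered layout differs (for example, captions in a separate folder or a single manifest file), only the `caption_path` lookup would need to change.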

High-quality off-the-shelf training datasets, collected with the explicit consent of the data subjects, help AI projects get off the ground quickly.

Recommended related datasets

30 Million High-quality Video Data

80 Million Vector Image Data

200 Million High-quality Image Data

7 Million Sets - High-Quality Video Caption Dataset

Contact: Headquarters, WATERRAS ANNEX 6F, 2-105 Kanda-Awajicho, Chiyoda-ku, Tokyo 101-0063; 03-6256-8911; info@datatang.co.jp

© 2021 Datatang Inc. All Rights Reserved.
