30,000 Images - Natural Scenes OCR Data in Southeast Asian Languages



https://www.datatang.co.jp



30,000 natural-scene OCR images in three Southeast Asian languages: Khmer (Cambodia), Lao, and Burmese. The collection covers a variety of natural scenes and shooting angles. The dataset can be used for Southeast Asian language OCR tasks.


Data Specifications

Data size: 30,000 images: 10,000 in Khmer (Cambodia), 10,000 in Lao, and 10,000 in Burmese

Collection environment: slogans, receipts, posters, warning signs, road signs, food packaging, billboards, station signs, signboards, etc.

Data diversity: a variety of natural scenes and multiple shooting angles

Device: cellphone

Photographic angle: looking up, looking down, and eye level

Data format: images are in common formats such as .jpg; annotation files are in .json

Annotation content: line-level (column-level) quadrilateral bounding boxes with text transcription; polygon bounding boxes with text transcription

Accuracy rate: a bounding box counts as qualified when every vertex of the quadrilateral or polygon is within 5 pixels of its correct position; bounding-box accuracy is at least 95%, and text transcription accuracy is at least 95%.
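
The accuracy criterion above can be illustrated with a minimal Python sketch. It is not the vendor's QA procedure. It assumes that "within 5 pixels" means Euclidean distance between each annotated vertex and a reference vertex, that vertices are listed in the same order in both boxes, and that reference boxes are available for comparison. The function names and all coordinates are hypothetical.

```python
import math

TOLERANCE_PX = 5.0  # vertex error bound stated in the specification


def is_qualified(annotated, reference, tol=TOLERANCE_PX):
    """Return True if every annotated vertex is within tol pixels of its reference vertex.

    Assumption: Euclidean distance, and both boxes list vertices in the same order.
    """
    if len(annotated) != len(reference):
        return False
    return all(math.dist(a, r) <= tol for a, r in zip(annotated, reference))


def box_accuracy(pairs, tol=TOLERANCE_PX):
    """Fraction of (annotated, reference) box pairs that are qualified."""
    if not pairs:
        raise ValueError("no boxes to evaluate")
    return sum(is_qualified(a, r, tol) for a, r in pairs) / len(pairs)


# Made-up coordinates for illustration only (not taken from the dataset).
reference = [(10, 10), (110, 12), (108, 40), (9, 38)]
good = [(12, 11), (109, 14), (110, 41), (8, 36)]   # every vertex is about 2.2 px off
bad = [(12, 11), (109, 14), (118, 41), (8, 36)]    # third vertex is about 10 px off

accuracy = box_accuracy([(good, reference), (bad, reference)])
print(f"bounding-box accuracy: {accuracy:.0%}")
```

Under this reading, a delivery would meet the stated threshold when `box_accuracy` over a sampled set of boxes is at least 0.95.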


Related Datasets

500,000 Images - Natural Scenes and Documents OCR Data

5,000 Images of Turkish Natural Scene OCR Data

8,604 Images of Arabic Natural Scene OCR Data

104,320 Images - Korean and Hindi OCR Data in Natural Scenes


Contact: Head office, 6F WATERRAS ANNEX, 2-105 Kanda-Awajicho, Chiyoda-ku, Tokyo 101-0063, Japan; 03-6256-8911; [email protected]


© 2021 Datatang Inc. All Rights Reserved.
