DATAKU

Extract Data from Content

Overview

The extraction page extracts structured information from three input types: pasted text, documents (such as DOC, DOCX, TXT, and PDF), and tables (CSV). You upload or paste your content, define custom extraction criteria, and the page parses the content against them.

Features

  1. Multiple Input Types: Supports text input, file upload (documents), and CSV table upload.

  2. Custom Extraction Schema: Users can define a custom schema for data extraction, specifying field names and descriptions.

  3. File Validation: Ensures file type and size validation for uploads.

  4. AI Schema Detection: Offers an AI-based option to automatically define the extraction schema.

  5. Responsive UI: Provides a user-friendly interface adaptable to various screen sizes.

How to Use

Providing Content

  1. Select Input Type: Choose between text, file, or table.

  2. Input Data:

    • For text: Paste the text in the provided textarea.

    • For files: Drag and drop or browse to upload document files.

    • For tables: Drag and drop or browse to upload a single-column CSV file.
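
Because the table input expects a single-column CSV file, it can help to check a file locally before uploading it. The following is a minimal Python sketch of such a check. It is not part of DATAKU, and the function name `is_single_column_csv` is hypothetical. It makes no assumption about whether the first row is a header.

```python
import csv
import io


def is_single_column_csv(text: str) -> bool:
    """Return True if the CSV text has at least one row and every
    non-blank row contains exactly one field."""
    # csv.reader yields an empty list for blank lines, so those are skipped.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    return bool(rows) and all(len(row) == 1 for row in rows)


# A quoted value may contain commas and still count as one column.
print(is_single_column_csv('review\n"great product, fast delivery"\n'))
print(is_single_column_csv("name,total\nAcme,42\n"))
```

To check a file on disk, read it with `open(path, newline="", encoding="utf-8").read()` and pass the text to the function.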

Defining Extraction Criteria

  1. Add Schema Fields: Click 'Add' to create more fields in your extraction schema.

  2. Enter Field Details: Provide a name and an optional description for each field.

  3. AI-Define: Use the AI-Define feature to automatically suggest a schema based on your input.
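
A schema as described above is a list of fields, each with a name and an optional description. The sketch below models that shape in Python so a schema can be drafted and sanity-checked before it is typed into the page. The names `SchemaField` and `validate_schema` are hypothetical, and the duplicate-name check is a self-imposed convention, not a documented DATAKU rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaField:
    name: str
    description: str = ""  # optional, mirroring the field form on the page


def validate_schema(fields: list[SchemaField]) -> list[str]:
    """Return a list of problems; an empty list means the draft looks usable."""
    problems = []
    seen = set()
    for field in fields:
        key = field.name.strip().lower()
        if not key:
            problems.append("a field has an empty name")
        elif key in seen:
            problems.append(f"duplicate field name: {field.name.strip()}")
        seen.add(key)
    return problems


draft = [
    SchemaField("invoice_number", "The identifier printed at the top of the invoice"),
    SchemaField("total"),
]
print(validate_schema(draft))
```

A short, specific description per field is a reasonable way to disambiguate similar values in the content, such as a subtotal and a total.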

Extraction Process

  1. Click 'Extract': Once the data and schema are set, click 'Extract' to initiate the extraction process.

  2. View Results: The extracted data appears in a table below the extraction section.

  3. Download and Copy: You can copy the extracted data or download it in CSV format by clicking the buttons at the top right corner.
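
Once the results are downloaded, the CSV can be loaded for further processing. The sketch below uses Python's standard `csv` module and assumes that the export has a header row whose column names match the schema field names. That layout is an assumption, not something this page confirms, and the function names are hypothetical.

```python
import csv
import io


def load_extracted_rows(csv_text: str) -> list[dict[str, str]]:
    """Parse exported CSV text into one dict per row, keyed by header name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def missing_fields(csv_text: str, expected: list[str]) -> list[str]:
    """Return expected field names that do not appear in the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []  # fieldnames is None for empty input
    return [name for name in expected if name not in header]


# Illustrative data only; a real export may be laid out differently.
sample = "invoice_number,total\nINV-001,42.50\n"
print(load_extracted_rows(sample))
print(missing_fields(sample, ["invoice_number", "due_date"]))
```

If `missing_fields` reports a name, compare the header of the downloaded file with the schema you defined before relying on the rows.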

Tips and Best Practices

  • Ensure file formats and sizes are within the specified limits for successful uploads.

  • Use the AI-Define feature to generate a schema quickly, especially for large or complex datasets.

  • Review and update the schema regularly to keep the extraction precise.

Troubleshooting

  • File Upload Errors: Check whether the file exceeds 20 MB or uses an unsupported format.

  • Extraction Errors: Ensure that the schema fields correspond to the format and content of the data. Extraction can also fail when the input content is too large or the schema is too complicated.
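
The upload checks above can be approximated locally before a file is submitted. This is a minimal Python sketch, not DATAKU's own validation. The extension list is taken from the formats named in the Overview and may not be exhaustive, treating the 20 MB limit as 20 × 1024 × 1024 bytes is an assumption, and `upload_problems` is a hypothetical name.

```python
from pathlib import Path

# Assumption: the 20 MB limit is interpreted as 20 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 20 * 1024 * 1024
# Formats named in the Overview; the real list may be longer.
ALLOWED_EXTENSIONS = {".doc", ".docx", ".txt", ".pdf", ".csv"}


def upload_problems(filename: str, size_bytes: int) -> list[str]:
    """Return reasons an upload would likely be rejected (empty if none)."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file exceeds 20 MB")
    return problems


print(upload_problems("report.PDF", 1024))
print(upload_problems("photo.png", 25 * 1024 * 1024))
```

For a file on disk, the size can be obtained with `Path(path).stat().st_size`.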

Support

For further assistance or to report an issue, contact support@dataku.ai.


Last updated 1 year ago