Skip to content

GitLab

  • Menu
Projects Groups Snippets
    • Loading...
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
    • Contribute to GitLab
  • Sign in
  • T texta-mlp-python
  • Project information
    • Project information
    • Activity
    • Labels
    • Members
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
  • Issues 19
    • Issues 19
    • List
    • Boards
    • Service Desk
    • Milestones
  • Merge requests 0
    • Merge requests 0
  • CI/CD
    • CI/CD
    • Pipelines
    • Jobs
    • Schedules
  • Deployments
    • Deployments
    • Environments
    • Releases
  • Monitor
    • Monitor
    • Metrics
    • Incidents
  • Packages & Registries
    • Packages & Registries
    • Container Registry
  • Analytics
    • Analytics
    • Value stream
    • CI/CD
    • Repository
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
Collapse sidebar
  • texta
  • texta-mlp-python
  • Issues
  • #4

Closed
Open
Created Apr 16, 2020 by Kristiina Vaik@kristiina_vaik

Split input document into paragraphs

  1. Document should be split on "\n\n", result: n paragraphs per document
  2. Detect the language of each paragraph
  3. Apply the correct MLP pipeline according to the detected language of each paragraph
  4. Put the entire document back together
  5. Adjust the initial fact spans
Assignee
Assign to
Time tracking