Apertium linguistic data for Spanish

Spanish: apertium-spa

This is an Apertium monolingual language package for Spanish. You can use it for:

  • Morphological analysis of Spanish
  • Morphological generation of Spanish
  • Part-of-speech tagging of Spanish

Requirements

You will need the following software installed:

  • lttoolbox (>= 3.7.1)
  • apertium (>= 3.8.3)
  • vislcg3 (>= 1.3.9)

If this does not make any sense, we recommend you look at: https://apertium.org
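
The minimum versions listed above can be checked before compiling. The following is a minimal sketch, assuming that pkg-config is installed and that the three packages register themselves under the module names lttoolbox, apertium and cg3 (the module names are an assumption, not something this README states). The helper check_module is hypothetical and only reports what it finds:

```shell
# Report whether a pkg-config module meets a minimum version.
# $1 = module name (assumed), $2 = minimum version from the list above.
check_module() {
    if pkg-config --atleast-version="$2" "$1" 2>/dev/null; then
        echo "ok: $1 >= $2"
    else
        # Also reached when pkg-config itself is not installed.
        echo "missing or too old: $1 (need >= $2)"
    fi
}

check_module lttoolbox 3.7.1
check_module apertium 3.8.3
check_module cg3 1.3.9
```

A "missing or too old" line may also mean that the package is installed under a prefix that pkg-config does not search.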

Compiling

With the requirements installed, you should be able to run:

$ ./autogen.sh
$ make

If you're doing development, you don't have to install the data; you can use it directly from this directory.

If you are installing this language package as a prerequisite for an Apertium translation pair, run the following (typically as root or with sudo):

# make install

You can give a --prefix to ./autogen.sh to install as a non-root user, but make sure to use the same prefix when installing the translation pair and any other language packages.
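
A non-root install along these lines might look like the sketch below. The prefix $HOME/.local is only an example, and the PKG_CONFIG_PATH entries are an assumption about where the .pc file ends up under that prefix; adjust both to your setup.

```shell
# Hypothetical prefix; any directory you can write to will do.
PREFIX="$HOME/.local"

# Assumption: other packages locate this one through pkg-config, so the
# prefix's pkgconfig directories must be on the search path.
export PKG_CONFIG_PATH="$PREFIX/share/pkgconfig:$PREFIX/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"

if [ -x ./autogen.sh ]; then
    # Configure, build and install under the chosen prefix.
    ./autogen.sh --prefix="$PREFIX" && make && make install
else
    echo "run this from the apertium-spa source directory"
fi
```

Use the same PREFIX value when building the translation pair and any other language packages.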

If any of this doesn't make sense or doesn't work, see https://wiki.apertium.org/wiki/Install_language_data_by_compiling

Testing

If you are in the source directory after running make, the following commands should work:

$ echo "Voy a la playa" | apertium -d . spa-morph
^Voy/ir<vblex><pri><p1><sg>$ ^a/a<pr>$ ^la/el<det><def><f><sg>/lo<prn><pro><p3><f><sg>$ 
^playa/playa<n><f><sg>$^./.<sent>$

$ echo "Voy a la playa" | apertium -d . spa-tagger
^ir<vblex><pri><p1><sg>$ ^a<pr>$ ^el<det><def><f><sg>$ ^playa<n><f><sg>$^.<sent>$
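
In the output above, each lexical unit is delimited by ^ and $, and tags are written in angle brackets. As a rough illustration of how that output can be post-processed, the sketch below splits the spa-tagger sample into one unit per line and prints the lemma with its first tag. The function name list_units is hypothetical, and the sketch assumes lemmas contain no escaped special characters:

```shell
# Sample spa-tagger output copied from the example above.
tagged='^ir<vblex><pri><p1><sg>$ ^a<pr>$ ^el<det><def><f><sg>$ ^playa<n><f><sg>$^.<sent>$'

# Print one "lemma<TAB>first-tag" line per lexical unit.
list_units() {
    printf '%s\n' "$1" |
        grep -o '\^[^$]*\$' |                 # isolate each ^...$ unit
        sed -e 's/^\^//' -e 's/\$$//' |       # strip the ^ and $ delimiters
        awk -F'<' '{ split($2, t, ">"); print $1 "\t" t[1] }'
}

list_units "$tagged"
```

To process live output instead of the sample, the function can be called as list_units "$(echo "Voy a la playa" | apertium -d . spa-tagger)".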

Tagger model training

To train the tagger model, do one of the following:

Supervised training:

$ make -f tagger.supervised.make

Unsupervised training:

$ make -f tagger.unsupervised.make

For details on the corpora used in training, check the corpora information.

For more information, see https://wiki.apertium.org/wiki/Tagger_training

Files and data

For more information

Help and support

If you need help using this language pair or data, you can contact:

See also the file AUTHORS included in this distribution.