Differences
This shows you the differences between two versions of the page.
install_cqp_mac [2006/12/08 22:12] – emiliano | install_cqp_mac [2006/12/08 22:32] (current) – emiliano | ||
---|---|---|---|
Line 79: | Line 79: | ||
Then you must set your environment, | Then you must set your environment, | ||
- | ((If you try putting your '' | + | ((If you try putting your '' |
+ | |||
+ | This is what happened to me after putting my registry in '' | ||
< | < | ||
Line 88: | Line 90: | ||
After that message, I had to move everything to / (root). If you follow the instructions above, you shouldn' | After that message, I had to move everything to / (root). If you follow the instructions above, you shouldn' | ||
- | * if you use the TCSH shell:\\ setenv CORPUS_REGISTRY "/ | + | * if you use the TCSH shell: |
- | * if you use the BASH shell:\\ export CORPUS_REGISTRY="/ | + | |
+ | | ||
+ | |||
+ | * if you use the BASH shell: | ||
+ | |||
+ | | ||
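As a minimal sketch of the BASH variant, the snippet below sets `CORPUS_REGISTRY` and checks that it points at an existing directory. The path in `registry_dir` is a hypothetical placeholder inside a temporary directory, not the location used in this guide; substitute your own registry directory.

```bash
#!/usr/bin/env bash
# Hypothetical registry location (an assumption for this sketch);
# replace it with the directory you actually use as your registry.
registry_dir="${TMPDIR:-/tmp}/cwb-demo/registry"
mkdir -p "$registry_dir"

# BASH syntax. In TCSH the equivalent is: setenv CORPUS_REGISTRY "<path>"
export CORPUS_REGISTRY="$registry_dir"

# Sanity check: the variable is set and points at an existing directory.
if [ -d "$CORPUS_REGISTRY" ]; then
    echo "CORPUS_REGISTRY is set to: $CORPUS_REGISTRY"
else
    echo "CORPUS_REGISTRY does not point to a directory" >&2
fi
```

An `export` typed in a terminal only lasts for that shell session, so the check is worth repeating whenever you open a new terminal.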
===== Installing a corpus ===== | ===== Installing a corpus ===== | ||
- | If you receive a corpus that is already encoded with cwb-encode (like the demo corpora), you will most probably | + | If you receive a corpus that is already encoded with '' |
- | * Rename data/ to some thing more interesting (" | + | * Rename |
- | * Move the renamed data/ directory into /corpora. | + | * Move the renamed |
- | * Move the content of registry/ into / | + | * Move the content of '' |
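The three steps above can be sketched as follows. To keep the sketch safe to run, it works entirely inside a scratch directory: `corpora_dir` and `registry_dir` stand in for the real corpora and registry locations, and the corpus name `dickens` and the placeholder files are assumptions made for illustration.

```bash
#!/usr/bin/env bash
set -eu

# Scratch area standing in for the real locations (assumption for this sketch).
work="$(mktemp -d)"
corpora_dir="$work/corpora"      # stands in for /corpora
registry_dir="$work/registry"    # stands in for your registry directory
mkdir -p "$corpora_dir" "$registry_dir"

# Simulate an unpacked, pre-encoded corpus: a data/ and a registry/ directory.
mkdir -p "$work/unpacked/data" "$work/unpacked/registry"
touch "$work/unpacked/data/placeholder.dat"   # made-up data file
touch "$work/unpacked/registry/dickens"       # made-up registry file

corpus_name="dickens"            # hypothetical name for the renamed data/ directory

# 1. Rename data/ to something more descriptive.
mv "$work/unpacked/data" "$work/unpacked/$corpus_name"
# 2. Move the renamed directory into the corpora directory.
mv "$work/unpacked/$corpus_name" "$corpora_dir/"
# 3. Move the content of registry/ into the registry directory.
mv "$work/unpacked/registry/"* "$registry_dir/"

ls "$corpora_dir" "$registry_dir"
```

On a real installation you would point the two variables at your actual directories instead of the scratch area.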
- | Let's say you are installing the DICKENS demo corpus. Let's say you now have the following situation in your /corpora directory: | + | Let's say you are installing the DICKENS demo corpus. Let's say you now have the following situation in your '' |
< | < | ||
Line 116: | Line 124: | ||
</ | </ | ||
- | Now browse into the / | + | Now browse into the '' |
< | < | ||
Line 129: | Line 137: | ||
</ | </ | ||
- | Do not touch anything, except the line defining " | + | Do not touch anything, except the line defining " |
- | In our case, we replace " | + | |
- | If you go back to the terminal, you will now be able to type the command cqp and use the installed corpus: | + | * Replace '' |
+ | * In our case, we replace '' | ||
+ | * Save the file. | ||
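If you prefer the command line to a text editor, the same edit can be sketched with `sed`. This assumes the line in question is the `HOME` entry that CWB registry files use to point at the data directory; the registry file below is a cut-down, made-up example, and `/corpora/dickens` is a hypothetical target path.

```bash
#!/usr/bin/env bash
set -eu

work="$(mktemp -d)"
registry_file="$work/dickens"    # stand-in for the real registry file
new_home="/corpora/dickens"      # hypothetical location of the moved data

# A cut-down, made-up registry file for demonstration only.
cat > "$registry_file" <<'EOF'
NAME "Demo corpus"
ID   dickens
HOME /old/location/data
ATTRIBUTE word
EOF

# Keep a backup, then rewrite only the line that starts with HOME.
# Writing to a new file avoids the differences between BSD and GNU `sed -i`.
cp "$registry_file" "$registry_file.bak"
sed "s|^HOME .*|HOME $new_home|" "$registry_file.bak" > "$registry_file"

# Show the edited line.
grep '^HOME' "$registry_file"
```

Every other line of the file is copied through unchanged, which matches the advice not to touch anything else.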
+ | |||
+ | If you go back to the terminal, you will now be able to type the command | ||
< | < | ||
Line 148: | Line 159: | ||
4369: und resounded through the < | 4369: und resounded through the < | ||
| | ||
- | [...] | + | [...] |
</ | </ | ||
+ | |||
===== Encoding a corpus ===== | ===== Encoding a corpus ===== | ||
Line 159: | Line 171: | ||
Make sure your corpus is formatted one token per line, as indicated in the " | Make sure your corpus is formatted one token per line, as indicated in the " | ||
- | If your corpus counts more than one file, it is advisable that you put all the files together in just one gzipped archive | + | If your corpus counts more than one file, it is advisable that you put all the files together in just one gzipped archive, e.g. using something like: |
- | gzip -c *.txt > newcorpus.gz). | + | |
+ | | ||
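A small sketch of what that `gzip` command does, using two made-up input files in a scratch directory (the file names and tokens are assumptions for illustration): all `.txt` files end up in one gzip stream, and decompressing it returns their contents concatenated in glob order.

```bash
#!/usr/bin/env bash
set -eu

work="$(mktemp -d)"
cd "$work"

# Two tiny made-up input files, one token per line.
printf 'Hello\nworld\n' > part1.txt
printf 'Good\nbye\n'    > part2.txt

# Compress all .txt files into a single gzipped file.
gzip -c *.txt > newcorpus.gz

# Decompress to standard output to check the combined contents.
gzip -dc newcorpus.gz
```

Because the files are joined in the order the shell expands `*.txt`, name them so that alphabetical order matches the order you want in the corpus.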
==== Import " | ==== Import " | ||
- | Create a directory / | + | Create a directory |
- | Browse to the directory where your corpus | + | Browse to the directory where your corpus-files are stored. |
<code bash> | <code bash> | ||
Line 172: | Line 185: | ||
</ | </ | ||
- | Issue the cwb-encode command, remembering that your encoded data will " | + | Issue the '' |
- | You will also have to define the "-P" | + | You will also have to define the '' |
- | characteristics of newcorpus. We are using a simple example: | + | |
<code bash> | <code bash> | ||
Line 185: | Line 197: | ||
==== Index " | ==== Index " | ||
- | If you gotten this far, then you're almost done. You are just missing the indexes for cqp to be able to use the imported data. You will need to issue | + | If you' |
- | just one command: cwb-makeall -V NEWCORPUS. Beware: type " | + | |
+ | You are just missing the indexes for '' | ||
+ | |||
+ | You will need to issue just one command: | ||
+ | |||
+ | Beware: type " | ||
<code bash> | <code bash> |