## markly

A Markov chain implementation focused on text generation.

## Dependencies

* Boost libraries and headers

To install them on Arch Linux:

    pacman -S boost-libs boost

To install them on Debian:

    apt install libboost-serialization-dev libboost-iostreams-dev

## Installation

    git clone git://git.yotsev.xyz/markly.git
    cd markly
    make install

## Usage examples

Generating 10000 usernames at order 3 and length 10 that resemble the ones in
names.txt:

    markly -s -f names.txt -o 3 -l 10 -n 10000

Continuously generating usernames at order 3 and length 10:

    markly -s -f names.txt -o 3 -l 10 -C

Continuously generating usernames at order 3 and maximum length:

    markly -s -f names.txt -o 3 -m -C

Generating a continuous username by resetting the gram when nothing follows:

    markly -s -f names.txt -o 3 -m -C > username.tmp
    tr -d '\n' < username.tmp > username
    rm username.tmp

Generating continuous text resembling that of book.txt:

    markly -f book.txt -o 5 -m -C

Saving a compiled chain for later use:

    markly -g book.txt -o 3
    # produces chain.o3
    markly -s -g names.txt names -o 3
    # produces names.o3

Using a compiled chain:

    markly -c chain.o3 -n 5 -m
    markly -c names.o3 -n 10000 -l 10

## Options

`-s`

Short form. Useful for text consisting of short strings on separate lines.
It takes the beginnings of such strings and stores them in a separate array
from the rest of the grams. When generating a new string, a random beginning
from the array is taken and expanded on using the chain. Use this if, without
it, all the output strings start with "a" or something similarly predictable.
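
The idea behind `-s` can be sketched in a few lines of Python. This is an illustration only and is not taken from markly's source: the function name `build_chain`, the dict/list layout, and the reading of a "gram" as a string of `order` characters are all assumptions.

```python
from collections import defaultdict

def build_chain(lines, order):
    """Build a character-level chain (hypothetical layout, not markly's)."""
    followers = defaultdict(list)  # gram -> characters seen right after it
    beginnings = []                # first gram of every input line
    for line in lines:
        if len(line) < order:
            continue               # too short to contain a single gram
        beginnings.append(line[:order])
        for i in range(len(line) - order):
            followers[line[i:i + order]].append(line[i + order])
    return dict(followers), beginnings

chain, beginnings = build_chain(["anna", "hannah", "joanna"], order=3)
```

Here `beginnings` is `["ann", "han", "joa"]`, so a generated string always starts the way some input line does, while `chain` holds the follower characters used to extend it.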

`-f [text.txt]`

File. Selects a file from which to compile a Markov chain and generate text.
To save the chain, see `-g`.

`-g text.txt [nameOfChain]`

Generate. In addition to compiling the chain, it saves it to nameOfChain.o3,
where "o3" is added automatically depending on the specified order (see `-o`).
When `nameOfChain` is omitted and other arguments follow, the chain is saved to
chain.o3 with the same handling of the extension. By default there is no
output, but nothing prevents it: you can still make the program generate text
by passing `-n`/`-C` and `-l`/`-m`.

`-c nameOfChain.o3`

Chain. Selects a compiled chain file for the text generation. When loading a
chain, specifying the order has no effect because the chain is already compiled
with a set order. The order is read automatically from the chain, so the `-o`
argument can be omitted entirely. The ".o3" extension plays no role in reading
the order; it is only there to help the user. The "3" in ".o3" is the order of
the chain, so the extension should be read as "order 3".
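
The relationship between the file name and the stored order can be illustrated with a short Python sketch. It is not markly's file format, which is not described here: `pickle` is used as a stand-in, and `save_chain` and `load_chain` are hypothetical names. The point is only that the order travels inside the file, while the ".oN" suffix is a label for the user.

```python
import pickle
from pathlib import Path

def save_chain(chain, order, name="chain", directory="."):
    """Write the chain to <name>.o<order>; the order is also stored inside."""
    path = Path(directory) / f"{name}.o{order}"
    with open(path, "wb") as fh:
        pickle.dump({"order": order, "chain": chain}, fh)
    return path

def load_chain(path):
    """Read the order from the file contents, never from the extension."""
    with open(path, "rb") as fh:
        data = pickle.load(fh)
    return data["order"], data["chain"]
```

With this layout, renaming `names.o3` to anything else would not change the order that `load_chain` reports.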

`-o order`

Order. Specifies the order of the compiled chain.

`-n iterations`

Iterations. Specifies the number of iterations, that is, how many strings are
generated.

`-C`

Continuous. Infinite iterations.

`-l length`

Length. Specifies the length of the generated strings in characters.

`-m`

Maximum length. Generates text until it reaches a gram with no following
characters.
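
The difference between `-l` and `-m` can be illustrated with a small generator in Python. This is a sketch under assumptions, not markly's code: the hypothetical `generate` function takes a chain as a dict mapping each gram to its follower characters, and `length=None` stands in for `-m`. How markly itself behaves when `-l` hits a gram with no followers is not stated above; stopping early is an assumption of this sketch.

```python
import random

def generate(chain, start, length=None, rng=random):
    """Extend `start` one character at a time.

    length=None mimics `-m`: stop only at a gram with no followers.
    Otherwise stop at `length` characters, or earlier at a dead end
    (the early stop is an assumption of this sketch).
    """
    order = len(start)
    out = start
    while length is None or len(out) < length:
        options = chain.get(out[-order:])
        if not options:
            break  # no character ever followed this gram
        out += rng.choice(options)
    return out[:length] if length is not None else out

# Tiny hand-written chain of order 2; the gram "cd" has no followers.
chain = {"ab": ["c"], "bc": ["d"]}
print(generate(chain, "ab"))            # abcd
print(generate(chain, "ab", length=3))  # abc
```

Note that with `length=None` the loop only ends at a dead end, so a chain in which every gram has a follower would keep generating indefinitely.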

`-v`

Verbose. Prints some status messages to standard error.

## License

GPLv2