A class for viewing a Term-Frequency model.
def vsm.viewer.tfviewer.TfViewer.__init__(self, corpus, model)
Initialize TfViewer.
:param corpus: Source of observed data.
:type corpus: :class:`Corpus`
:param model: A Term-Frequency model.
:type model: TfSeq or TfMulti object.
def vsm.viewer.tfviewer.TfViewer.coll_freq(self, word)
Returns the total frequency of `word` across all documents.
:param word: Word whose frequency is retrieved.
:type word: string or integer
:returns: frequency as integer
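As an illustration of what a collection frequency is, the sketch below sums a word's occurrences over a toy corpus. It uses plain Python lists and is not the library's implementation; the `docs` data and the standalone `coll_freq` function are hypothetical.

```python
from collections import Counter

# Hypothetical toy corpus: each document is a list of word tokens.
docs = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat", "the"],
    ["a", "cat"],
]

def coll_freq(docs, word):
    """Total number of occurrences of `word` across all documents."""
    return sum(Counter(doc)[word] for doc in docs)

print(coll_freq(docs, "the"))   # 3
print(coll_freq(docs, "bird"))  # 0 (word never occurs)
```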
def vsm.viewer.tfviewer.TfViewer.coll_freqs(self, print_len=20, as_strings=True)
Returns the frequency of all words in all documents.
:param print_len: Number of words to display. Default is 20.
:type print_len: integer, optional
:param as_strings: If `True`, words are represented as strings
rather than their integer representation.
:type as_strings: boolean, optional
:returns: an instance of :class:`LabeledColumn`.
A table with words and their frequencies.
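The following sketch shows one way such a frequency table could be built, and how `print_len` and `as_strings` might shape it. It is an assumption-laden illustration, not the library's code: the `vocab` list, the integer-encoded `docs`, and the descending sort order are all hypothetical, and plain tuples stand in for :class:`LabeledColumn`.

```python
from collections import Counter

# Hypothetical toy data: documents hold integer word ids,
# and `vocab` maps each id back to its string form.
vocab = ["a", "cat", "dog", "sat", "the"]
docs = [[4, 1, 3], [4, 2, 3, 4], [0, 1]]

def coll_freqs(docs, vocab, print_len=20, as_strings=True):
    """Return (word, frequency) rows, truncated to `print_len` entries."""
    counts = Counter(w for doc in docs for w in doc)
    # Assumption: rows are ordered by decreasing frequency, ties by word id.
    rows = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:print_len]
    if as_strings:
        # Replace integer ids with their string representation.
        rows = [(vocab[w], n) for w, n in rows]
    return rows

print(coll_freqs(docs, vocab, print_len=3))
print(coll_freqs(docs, vocab, print_len=3, as_strings=False))
```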
def vsm.viewer.tfviewer.TfViewer.dismat_doc(self, doc_list, dist_fn=angle_sparse)
Calculates a distance matrix for a given list of documents.
:param doc_list: A list of documents whose distance matrix is to be
computed.
:type doc_list: list of strings/integers.
:param dist_fn: A distance function from functions in vsm.spatial.
Default is :meth:`angle_sparse`.
:type dist_fn: function, optional
:returns: an instance of :class:`IndexedSymmArray`.
n x n matrix containing floats where n is the number of
documents in `doc_list`.
:See Also: :meth:`vsm.viewer.wrappers.dismat_doc`
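To make the shape of the result concrete, here is a minimal sketch of a symmetric n x n distance matrix over term-frequency vectors. It assumes the distance is the angle between two vectors (the arccosine of their cosine similarity) and uses dense Python lists; the `angle` and `dismat` helpers are hypothetical stand-ins, not the functions in vsm.spatial.

```python
import math

def angle(u, v):
    """Angle in radians between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        return float("nan")  # the angle is undefined for a zero vector
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def dismat(vectors, dist_fn=angle):
    """Symmetric n x n matrix of pairwise distances, zero on the diagonal."""
    n = len(vectors)
    mat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            mat[i][j] = mat[j][i] = dist_fn(vectors[i], vectors[j])
    return mat

# Hypothetical term-frequency vectors for three documents.
doc_vectors = [[1, 0, 2], [2, 0, 4], [0, 3, 0]]
m = dismat(doc_vectors)
# m[0][1] is numerically close to 0 (parallel vectors);
# m[0][2] is pi/2 (the documents share no words).
```

Only the upper triangle is computed and then mirrored, which is why a symmetric array type is a natural return value.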
def vsm.viewer.tfviewer.TfViewer.dismat_word(self, word_list, dist_fn=angle_sparse)
Calculates a distance matrix for a given list of words.
:param word_list: A list of words whose distance matrix is to be
computed.
:type word_list: list of strings/integers.
:param dist_fn: A distance function from functions in vsm.spatial.
Default is :meth:`angle_sparse`.
:type dist_fn: function, optional
:returns: an instance of :class:`IndexedSymmArray`.
n x n matrix containing floats where n is the number of words
in `word_list`.
:See Also: :meth:`vsm.viewer.wrappers.dismat_word`
def vsm.viewer.tfviewer.TfViewer.dist_doc_doc(self, doc_or_docs, weights=[], print_len=10, filter_nan=True, label_fn=def_label_fn, as_strings=True, dist_fn=angle_sparse, order='i')
Computes and sorts the distances between a document or list of documents
and every document.
:param doc_or_docs: Query document(s) to which distances are calculated.
:type doc_or_docs: string/integer or list of strings/integers
:param weights: Specify weights for each query doc in `doc_or_docs`.
Default uses equal weights (i.e. arithmetic mean)
:type weights: list of floating point, optional
:param print_len: Number of documents to be displayed. Default is 10.
:type print_len: int, optional
:param filter_nan: If `True`, not-a-number (NaN) entries are filtered out.
Default is `True`.
:type filter_nan: boolean, optional
:param label_fn: A function that defines how documents are represented.
Default is :meth:`def_label_fn` which retrieves the labels
from corpus metadata.
:type label_fn: function, optional
:param as_strings: If `True`, returns a list of documents as strings
rather than indices. Default is `True`.
:type as_strings: boolean, optional
:param dist_fn: A distance function from functions in vsm.spatial.
Default is :meth:`angle_sparse`.
:type dist_fn: function, optional
:param order: Order of sorting. 'i' for increasing and 'd' for
decreasing order. Default is 'i'.
:type order: string, optional
:returns: an instance of :class:`LabeledColumn`.
A 2-dim array containing documents and their distances to
`doc_or_docs`.
:See Also: :meth:`vsm.viewer.wrappers.dist_doc_doc`
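The interaction of `filter_nan`, `order`, and `print_len` can be sketched as a small ranking step applied to already-computed distances. This is an illustrative sketch only: the `rank` helper, the labels, and the distance values are hypothetical, and a plain list of tuples stands in for :class:`LabeledColumn`.

```python
import math

def rank(labels, distances, print_len=10, filter_nan=True, order="i"):
    """Pair labels with distances, drop NaNs, sort, and truncate."""
    pairs = list(zip(labels, distances))
    if filter_nan:
        pairs = [(lab, d) for lab, d in pairs if not math.isnan(d)]
    # 'i' sorts in increasing order (nearest first), 'd' in decreasing order.
    pairs.sort(key=lambda p: p[1], reverse=(order == "d"))
    return pairs[:print_len]

# Hypothetical distances from a query document to four documents.
labels = ["doc0", "doc1", "doc2", "doc3"]
distances = [0.0, 1.2, float("nan"), 0.4]

print(rank(labels, distances, print_len=2))             # [('doc0', 0.0), ('doc3', 0.4)]
print(rank(labels, distances, print_len=2, order="d"))  # [('doc1', 1.2), ('doc3', 0.4)]
```

Filtering before sorting matters in this sketch: NaN compares false against every number, so leaving NaN entries in would make the resulting order unreliable.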
def vsm.viewer.tfviewer.TfViewer.dist_word_doc(self, word_or_words, weights=[], label_fn=def_label_fn, filter_nan=True, print_len=10, as_strings=True, dist_fn=angle_sparse, order='i')
Computes and sorts distances between a word or a list of words and
every document.
:param word_or_words: Query word(s) from which a pseudo-document is
created for computing distances.
:type word_or_words: string/integer or list of strings/integers
:param weights: Specify weights for each query word in `word_or_words`.
Default uses equal weights (i.e. arithmetic mean)
:type weights: list of floating point, optional
:param print_len: Number of documents to be displayed. Default is 10.
:type print_len: int, optional
:param filter_nan: If `True`, not-a-number (NaN) entries are filtered out.
Default is `True`.
:type filter_nan: boolean, optional
:param label_fn: A function that defines how documents are represented.
Default is :meth:`def_label_fn` which retrieves the labels
from corpus metadata.
:type label_fn: function, optional
:param as_strings: If `True`, returns a list of documents as strings
rather than indices. Default is `True`.
:type as_strings: boolean, optional
:param dist_fn: A distance function from functions in vsm.spatial.
Default is :meth:`angle_sparse`.
:type dist_fn: function, optional
:param order: Order of sorting. 'i' for increasing and 'd' for decreasing
order. Default is 'i'.
:type order: string, optional
:returns: an instance of :class:`LabeledColumn`.
A 2-dim array containing documents and their distances to
`word_or_words`.
:See Also: :meth:`vsm.viewer.wrappers.dist_word_doc`
def vsm.viewer.tfviewer.TfViewer.dist_word_word(self, word_or_words, weights=[], filter_nan=True, print_len=10, as_strings=True, dist_fn=angle_sparse, order='i')
Returns words sorted by the distances between word(s) and every word.
:param word_or_words: Query word(s) to which distances are calculated.
:type word_or_words: string or list of strings
:param weights: Specify weights for each query word in `word_or_words`.
Default uses equal weights (i.e. arithmetic mean)
:type weights: list of floating point, optional
:param filter_nan: If `True`, not-a-number (NaN) entries are filtered out.
Default is `True`.
:type filter_nan: boolean, optional
:param print_len: Number of words to be displayed. Default is 10.
:type print_len: int, optional
:param as_strings: If `True`, returns a list of words as strings rather
than their integer representations. Default is `True`.
:type as_strings: boolean, optional
:param dist_fn: A distance function from functions in vsm.spatial.
Default is :meth:`angle_sparse`.
:type dist_fn: function, optional
:param order: Order of sorting. 'i' for increasing and 'd' for
decreasing order. Default is 'i'.
:type order: string, optional
:returns: an instance of :class:`LabeledColumn`.
A 2-dim array containing words and their distances to
`word_or_words`.
:See Also: :meth:`vsm.viewer.wrappers.dist_word_word`
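The `weights` parameter can be pictured as combining several query word vectors into a single vector before distances are computed. The sketch below assumes a weighted average of the vectors, which reduces to the arithmetic mean when no weights are given; the `weighted_mean` helper and the sample vectors are hypothetical and not taken from the library.

```python
def weighted_mean(vectors, weights=()):
    """Combine equal-length vectors into one weighted-average vector."""
    if not weights:
        weights = [1.0] * len(vectors)  # equal weights: arithmetic mean
    total = sum(weights)
    dim = len(vectors[0])
    return [
        sum(w * v[i] for w, v in zip(weights, vectors)) / total
        for i in range(dim)
    ]

# Hypothetical per-document frequencies of two query words.
query_vectors = [[1, 0, 2], [3, 4, 0]]

print(weighted_mean(query_vectors))          # [2.0, 2.0, 1.0]
print(weighted_mean(query_vectors, [3, 1]))  # [1.5, 1.0, 1.5]
```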
The documentation for this class was generated from the following file:
- vsm/vsm/viewer/tfviewer.py