Changeset 872089 in git


Timestamp: Jul 26, 2019, 4:13:00 PM
Author: Murray Heymann <heymann.murray@…>
Branches: spielwiese (d1ba061a762c62d3a25159d8da8b6e17332291fa)
Children: 55a5abbd0397943b2c98a4492b27ecbc050c20ef
Parents: 746d3db65a7b75cbc29407ee9382332483801f1d
Message: Reorganize source tree, make test prediction
Files: 2 added, 2 edited, 1 moved

  • .gitignore

    --- r746d3d
    +++ r872089
    @@ -120,2 +120,3 @@
     keywords.txt
     helpfiles
    +*.npy
  • machine_learning/common/lookuptable.py

    --- r746d3d
    +++ r872089
    @@ -1,3 +1,4 @@
     """
    +j
     A Module for fetching helpfiles, creating vectors for each and bundling
     these up in a lookup table.
    @@ -13,6 +14,6 @@
     
     # local imports
    -from keyword_vector import count_occurances, read_dictionary
    -from constants import HELP_FILE_URL, HELP_FILE_PATH, SINGULAR_BIN, \
    +from common.keyword_vector import count_occurances, read_dictionary
    +from common.constants import HELP_FILE_URL, HELP_FILE_PATH, SINGULAR_BIN, \
                             EXTRACT_SCRIPT, KEYWORDS_FILE, HELPFILE_NPY, \
                             VECTORS_NPY
    @@ -51,10 +52,9 @@
     
     
    -def create_table():
    +def create_table(dictionary=read_dictionary(KEYWORDS_FILE)):
         """
         Get a list of helpfiles, and generate a word occurance vector for each.
         """
    -    vectors = []
    -    dictionary = read_dictionary(KEYWORDS_FILE)
    +    vectors = []
     
         if not os.path.isfile(VECTORS_NPY) or not os.path.isfile(HELPFILE_NPY):
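
Review note on the new `create_table` signature: Python evaluates default argument values once, when the `def` statement runs, so the default `read_dictionary(KEYWORDS_FILE)` is computed when the module is imported rather than on each call. The sketch below illustrates that behaviour and a common lazy alternative. The names `read_dictionary_stub`, `create_table_eager` and `create_table_lazy` are hypothetical stand-ins and are not part of this changeset.

```python
# Hypothetical stand-ins; the real read_dictionary/create_table are not used.
calls = []  # records every time the (stub) dictionary loader runs


def read_dictionary_stub(path):
    """Pretend to load a keyword list from `path` and log the call."""
    calls.append(path)
    return ["ideal", "ring"]


# Eager default: the loader runs once, right here, at definition time.
def create_table_eager(dictionary=read_dictionary_stub("keywords.txt")):
    return dictionary


# Lazy default: the loader runs only when a call omits the argument.
def create_table_lazy(dictionary=None):
    if dictionary is None:
        dictionary = read_dictionary_stub("keywords.txt")
    return dictionary


calls_after_definition = len(calls)  # loader already ran for the eager default

create_table_eager()
create_table_eager()
calls_after_eager = len(calls)       # unchanged: the default is reused

create_table_lazy()
calls_after_lazy = len(calls)        # one more: loaded on demand
```

With the eager form, importing the module may fail or be slow if the keyword file is missing or large, even when every caller passes `dictionary` explicitly (as `predictor.py` does in this changeset). Whether that matters here depends on how the module is used.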
  • machine_learning/model/predictor.py

    --- r746d3d
    +++ r872089
    @@ -8,5 +8,8 @@
     
     # Local imports
    -from keyword_vector import vector_distance
    +from common.keyword_vector import vector_distance, count_occurances, \
    +                                    read_dictionary
    +from common.lookuptable import create_table
    +from common.constants import KEYWORDS_FILE
     
     
    @@ -89,4 +92,11 @@
         print(prediction)
     
    +    dictionary = read_dictionary(KEYWORDS_FILE)
    +    vectors, file_list = create_table(dictionary=dictionary)
    +    test_vec = count_occurances("extract.lib", dictionary)
    +    predictor.fit(vectors, file_list)
    +    prediction = predictor.predict(np.array([test_vec]))
    +    print(prediction)
    +
     if __name__ == '__main__':
         main()
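
For readers without the rest of the tree, the added test prediction can be approximated by the self-contained sketch below. It is an illustration under stated assumptions, not the code from this repository: the predictor class is not shown in the diff, so `NearestNeighbour` is a hypothetical 1-nearest-neighbour stand-in; `count_occurrences` here takes text directly, whereas the real `count_occurances` appears to take a file name; and the tiny dictionary and help texts are made up.

```python
import math
import re


def count_occurrences(text, dictionary):
    """Return one count per dictionary keyword (hypothetical stand-in)."""
    words = re.findall(r"[A-Za-z_]+", text)
    return [words.count(keyword) for keyword in dictionary]


def vector_distance(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class NearestNeighbour:
    """Hypothetical predictor with the fit/predict shape used in the diff."""

    def fit(self, vectors, labels):
        self.vectors = list(vectors)
        self.labels = list(labels)
        return self

    def predict(self, queries):
        # For each query, return the label of the closest stored vector.
        results = []
        for query in queries:
            best = min(range(len(self.vectors)),
                       key=lambda i: vector_distance(query, self.vectors[i]))
            results.append(self.labels[best])
        return results


# Made-up data standing in for the keyword file and the help files.
dictionary = ["ideal", "ring", "matrix"]
helpfiles = {
    "std.txt": "ring ideal ideal ideal",
    "det.txt": "matrix matrix ring",
}

file_list = list(helpfiles)
vectors = [count_occurrences(helpfiles[name], dictionary) for name in file_list]
test_vec = count_occurrences("ideal ideal ring", dictionary)

predictor = NearestNeighbour().fit(vectors, file_list)
prediction = predictor.predict([test_vec])
print(prediction)
```

The flow mirrors the added lines: build the dictionary, build one vector per help file, vectorise the test input, fit, then predict on a one-element batch.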