SIGMOD Contest 2013
Macros | |
#define | MAX_DOC_LENGTH (1<<22) |
Maximum document length in characters. | |
#define | MAX_WORD_LENGTH 31 |
Maximum word length in characters. | |
#define | MIN_WORD_LENGTH 4 |
Minimum word length in characters. | |
#define | MAX_QUERY_WORDS 5 |
Maximum number of words in a query. | |
#define | MAX_QUERY_LENGTH ((MAX_WORD_LENGTH+1)*MAX_QUERY_WORDS) |
Maximum query length in characters. | |
Typedefs | |
typedef unsigned int | QueryID |
Query ID type. | |
typedef unsigned int | DocID |
Document ID type. | |
Enumerations | |
enum | MatchType { MT_EXACT_MATCH , MT_HAMMING_DIST , MT_EDIT_DIST } |
Matching types. | |
enum | ErrorCode { EC_SUCCESS , EC_NO_AVAIL_RES , EC_FAIL } |
Error codes. | |
Functions | |
ErrorCode | InitializeIndex () |
Called only once at the beginning of the whole test. | |
ErrorCode | DestroyIndex () |
Called only once at the end of the whole test. | |
ErrorCode | StartQuery (QueryID query_id, const char *query_str, MatchType match_type, unsigned int match_dist) |
Add a query (associated with matching type) to the active query set. | |
ErrorCode | EndQuery (QueryID query_id) |
Remove a query from the active query set. | |
ErrorCode | MatchDocument (DocID doc_id, const char *doc_str) |
Push a document to the server. | |
ErrorCode | GetNextAvailRes (DocID *p_doc_id, unsigned int *p_num_res, QueryID **p_query_ids) |
Return the next available active queries subset that matches any previously submitted document, sorted by query IDs. | |
#define MAX_DOC_LENGTH (1<<22) |
Maximum document length in characters.
#define MAX_QUERY_LENGTH ((MAX_WORD_LENGTH+1)*MAX_QUERY_WORDS) |
Maximum query length in characters.
#define MAX_QUERY_WORDS 5 |
Maximum number of words in a query.
#define MAX_WORD_LENGTH 31 |
Maximum word length in characters.
#define MIN_WORD_LENGTH 4 |
Minimum word length in characters.
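As a small illustration of how these limits combine, the sketch below restates the macros and derives buffer sizes from them. `QueryWords` and `QUERY_BUF_SIZE` are hypothetical names introduced here for illustration only; they are not part of the contest header.

```c
#include <stddef.h>

/* Constants restated from the macro list above. */
#define MAX_DOC_LENGTH (1<<22)                                  /* 4194304 */
#define MAX_WORD_LENGTH 31
#define MIN_WORD_LENGTH 4
#define MAX_QUERY_WORDS 5
#define MAX_QUERY_LENGTH ((MAX_WORD_LENGTH+1)*MAX_QUERY_WORDS)  /* 32 * 5 = 160 */

/* Hypothetical fixed-size storage for one parsed query:
 * each slot holds one word plus its terminating '\0'. */
typedef struct {
    char words[MAX_QUERY_WORDS][MAX_WORD_LENGTH + 1];
    unsigned int num_words;
} QueryWords;

/* Hypothetical buffer size for a raw query string plus its terminator. */
enum { QUERY_BUF_SIZE = MAX_QUERY_LENGTH + 1 };
```

Five words of 31 characters joined by four single spaces occupy 159 characters, so MAX_QUERY_LENGTH (160) covers that case with one character to spare.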
typedef unsigned int DocID |
Document ID type.
typedef unsigned int QueryID |
Query ID type.
enum ErrorCode |
Error codes:
Enumerator | |
---|---|
EC_SUCCESS | Must be returned by each core function unless specified otherwise. |
EC_NO_AVAIL_RES | Must be returned only if there is no available result to be returned by GetNextAvailRes(). That is, all results have already been returned via previous calls to GetNextAvailRes(). |
EC_FAIL | Used only for debugging purposes, and must not be returned in the final submission. |
enum MatchType |
Matching types:
ErrorCode DestroyIndex | ( | ) |
Called only once at the end of the whole test.
Can be used for releasing all memory used to index active queries. The time spent in this function will not affect the score of the submission.
ErrorCode EndQuery | ( | QueryID | query_id | ) |
Remove a query from the active query set.
[in] | query_id | The integral ID of the query. This function will not be called twice with the same query ID. |
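ErrorCode GetNextAvailRes | ( | DocID * | p_doc_id, |
unsigned int * | p_num_res, | ||
QueryID ** | p_query_ids ) |
Return the next available active queries subset that matches any previously submitted document, sorted by query IDs.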
The returned result must depend on the state of the active queries at the time of calling MatchDocument().
[out] | *p_doc_id | A document ID that has not been returned before. You can choose to return the results of any document that has not been returned before. |
[out] | *p_num_res | The number of active queries that matched the document *p_doc_id. |
[out] | *p_query_ids | An array of the IDs of the *p_num_res matching queries, ordered by ID value. This array must be allocated by the core library using malloc(). The core library must not free this array, since the testing benchmark frees it. If *p_num_res is 0, this array must not be allocated, as the testing benchmark will not free it in that case. Allocating this array using "new" is not acceptable. If *p_num_res is not zero, the size of this array must be equal to "(*p_num_res)*sizeof(QueryID)" bytes. |
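The allocation rules for *p_query_ids can be sketched as follows. This is a minimal illustration, not the reference implementation; `make_result_array` and `cmp_query_id` are hypothetical helpers, and the sketch assumes the matching query IDs have already been collected in some internal buffer.

```c
#include <stdlib.h>
#include <string.h>

typedef unsigned int QueryID;

/* Hypothetical qsort comparator: ascending QueryID order. */
static int cmp_query_id(const void *a, const void *b) {
    QueryID x = *(const QueryID *)a;
    QueryID y = *(const QueryID *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: copy n matched IDs into a malloc()'d array sorted
 * by ID. When n == 0 nothing is allocated and *out is set to NULL, since
 * the benchmark does not free the array in that case.
 * Returns 0 on success, -1 if malloc() fails. */
static int make_result_array(const QueryID *matched, unsigned int n,
                             QueryID **out) {
    QueryID *arr;
    *out = NULL;
    if (n == 0) return 0;
    arr = (QueryID *)malloc((size_t)n * sizeof(QueryID)); /* exact size */
    if (arr == NULL) return -1;
    memcpy(arr, matched, (size_t)n * sizeof(QueryID));
    qsort(arr, n, sizeof(QueryID), cmp_query_id);
    *out = arr;   /* ownership passes to the caller (the benchmark) */
    return 0;
}
```

A GetNextAvailRes() implementation could call such a helper and then hand the pointer over without keeping or freeing it.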
ErrorCode InitializeIndex | ( | ) |
Called only once at the beginning of the whole test.
Performs any required initializations.
ErrorCode MatchDocument | ( | DocID | doc_id, |
const char * | doc_str ) |
Push a document to the server.
[in] | doc_id | The integral ID of the document. This function will not be called twice with the same document ID. |
[in] | doc_str | A null-terminated string representing the document. It consists of a space-separated set of words. The length of any word will be at least MIN_WORD_LENGTH characters and at most MAX_WORD_LENGTH characters. The length of any document will not exceed MAX_DOC_LENGTH characters. "doc_str" contains at least one non-space character. "doc_str" contains only lowercase letters from 'a' to 'z' and space characters. |
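Given the format described for doc_str, splitting a document into words needs only a single pass over the string. The sketch below is one possible way to do it; `WordFn` and `for_each_word` are hypothetical names, and the code tolerates repeated, leading, or trailing spaces because the text above does not rule them out.

```c
#include <stddef.h>

/* Hypothetical callback: receives a pointer to the first character of a
 * word (not null-terminated) and its length. */
typedef void (*WordFn)(const char *word, size_t len, void *ctx);

/* Hypothetical helper: call fn (if non-NULL) for every space-separated
 * word in str and return the number of words found. */
static unsigned int for_each_word(const char *str, WordFn fn, void *ctx) {
    unsigned int count = 0;
    const char *p = str;
    while (*p) {
        const char *start;
        while (*p == ' ') p++;            /* skip separators */
        start = p;
        while (*p && *p != ' ') p++;      /* scan one word */
        if (p > start) {
            if (fn) fn(start, (size_t)(p - start), ctx);
            count++;
        }
    }
    return count;
}
```

Because words are reported as pointer and length pairs, the document string is never copied or modified, which matters for documents close to MAX_DOC_LENGTH.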
ErrorCode StartQuery | ( | QueryID | query_id, |
const char * | query_str, | ||
MatchType | match_type, | ||
unsigned int | match_dist ) |
Add a query (associated with matching type) to the active query set.
[in] | query_id | The integral ID of the query. This function will not be called twice with the same query ID. |
[in] | query_str | A null-terminated string representing the query. It consists of a space-separated set of words. The length of any word will be at least MIN_WORD_LENGTH characters and at most MAX_WORD_LENGTH characters. The number of words in a query will not exceed MAX_QUERY_WORDS words. "query_str" contains at least one non-space character. "query_str" contains only lowercase letters from 'a' to 'z' and space characters. |
[in] | match_type | The type of mechanism used to consider a query as a match to any document, as specified in MatchType enumeration. |
[in] | match_dist | The Hamming or edit distance threshold (according to "match_type"), used as explained in the MatchType enumeration. This parameter must be equal to 0 for exact matching. The possible values of this parameter are 0, 1, 2, and 3. A query matches a document if and only if, for each word in the query, there exists a word in the document that matches it under the "match_type" and "match_dist" constraints. Note that the "match_dist" constraint is applied independently to each word in the query. |
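The matching rule above can be sketched as a brute-force check. This is an unoptimized illustration under stated assumptions: Hamming distance is taken to be defined only for words of equal length, edit distance is taken to be the Levenshtein distance (insert, delete, substitute), and all helper names (`hamming_dist`, `edit_dist`, `word_matches`, `next_word`, `query_matches`) are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAX_WORD_LENGTH 31
typedef enum { MT_EXACT_MATCH, MT_HAMMING_DIST, MT_EDIT_DIST } MatchType;

/* Mismatching positions; a large sentinel if the lengths differ. */
static unsigned int hamming_dist(const char *a, size_t na,
                                 const char *b, size_t nb) {
    unsigned int d = 0;
    size_t i;
    if (na != nb) return ~0u;
    for (i = 0; i < na; i++) d += (a[i] != b[i]);
    return d;
}

/* Levenshtein distance using one rolling row; needs nb <= MAX_WORD_LENGTH. */
static unsigned int edit_dist(const char *a, size_t na,
                              const char *b, size_t nb) {
    unsigned int row[MAX_WORD_LENGTH + 1];
    size_t i, j;
    for (j = 0; j <= nb; j++) row[j] = (unsigned int)j;
    for (i = 1; i <= na; i++) {
        unsigned int diag = row[0];              /* D[i-1][j-1] */
        row[0] = (unsigned int)i;
        for (j = 1; j <= nb; j++) {
            unsigned int up = row[j];            /* D[i-1][j] */
            unsigned int best = diag + (a[i - 1] != b[j - 1]);
            if (up + 1 < best) best = up + 1;
            if (row[j - 1] + 1 < best) best = row[j - 1] + 1;
            diag = up;
            row[j] = best;
        }
    }
    return row[nb];
}

/* Does one query word match one document word? */
static int word_matches(const char *q, size_t nq, const char *d, size_t nd,
                        MatchType type, unsigned int dist) {
    switch (type) {
    case MT_EXACT_MATCH:  return nq == nd && memcmp(q, d, nq) == 0;
    case MT_HAMMING_DIST: return hamming_dist(q, nq, d, nd) <= dist;
    case MT_EDIT_DIST:    return edit_dist(q, nq, d, nd) <= dist;
    }
    return 0;
}

/* Return the next space-separated word at or after *p, or NULL at the end. */
static const char *next_word(const char **p, size_t *len) {
    const char *s = *p;
    const char *start;
    while (*s == ' ') s++;
    if (!*s) { *p = s; return NULL; }
    start = s;
    while (*s && *s != ' ') s++;
    *len = (size_t)(s - start);
    *p = s;
    return start;
}

/* A query matches if every query word matches some document word;
 * the threshold is applied to each query word independently. */
static int query_matches(const char *query, const char *doc,
                         MatchType type, unsigned int dist) {
    const char *qp = query, *qw;
    size_t nq, nd;
    while ((qw = next_word(&qp, &nq)) != NULL) {
        const char *dp = doc, *dw;
        int found = 0;
        while (!found && (dw = next_word(&dp, &nd)) != NULL)
            found = word_matches(qw, nq, dw, nd, type, dist);
        if (!found) return 0;
    }
    return 1;
}
```

This check rescans the whole document for every query word, so it is useful mainly as a correctness baseline against which an indexed implementation can be compared.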