CSE 333 Section 4 - Courses.cs.washington.edu

8d ago
14 Views
0 Downloads
2.67 MB
23 Pages
Last View : 1d ago
Last Download : n/a
Upload by : Tia Newell
Transcription

CSE 333Section 4HW2 Overview, C Intro

Logistics Homework 2 Due next Thursday (4/28) @ 11:00pm Indexing files to allow for searching Exercise 9 Out on Friday Deals with classes in C Due Monday (4/25) @ 10:00am

Homework 2 Overview

Homework 2 Main Idea: Build a search engine for a file system It can take in queries and output a list of files in a directory that has that queryThe query will be ordered based on the number of times the query is in that fileShould handle multiple word queries (Note: all words in a query have to be in the file) What does this mean? Part A: Parsing a file and reading all of its contents into heap allocated memoryPart B: Crawling a directory (reading all regular files recursively in a directory) andbuilding an index to query fromPart C: Build a searchshell (search engine) to query your index for resultsNote: It will use the LinkedList andHashTable implementations from HW1!

somefile.txtPart A: File ParsingRead a file and generate a HashTableof WordPositions!Word positions will include the wordand LinkedList of its positions in a file.typedef struct WordPositions {char*word;// normalized word.Owned.LinkedList*positions; // list of DocPositionOffset t.} WordPositions;Note that the key is the hashed C-string of WordPositionsMy goodness! I love the course CSE333.\nI'll recommend this course to my friends.\nParseIntoWordPositionsTable(contents)

Part B: Directory Crawling – DocTableRead through a directory in CrawlFileTree.cFor each file visited, build your DocTable andMemIndex!DocTable maps document names to IDs. FNV64 isa hash function.struct doctable st {HashTable *id to name;// mapping doc id to doc nameHashTable *name to id;// mapping docname to doc idDocID t// max docID allocated so farmax id;};DocID t DocTable Add(DocTable *table, char *doc name);

Part B: Directory Crawling – MemIndexMemIndex is an index to view files.It’s a HashTable of WordPostings.typedef struct {char*word;HashTable*postings;} WordPostings;Let’s try to find what contains “course”: WordPostings’ postings has anelement with key 3 (OnlyDocID 3 has “course in its file”)The value is the LinkedList of offsetsthe words are in DocID 3DocID onOffset t

Part C: Searchshell Use queries to ask for a result! Formatting should match example outputExact implementation is up to you!MemIndex.htypedef struct SearchResult {uint64 t docid;// a document that matches a search queryuint32 t rank;// an indicator of the quality of the match} SearchResult, *SearchResultPtr;Querycourse friends myMemIndex Search(MemIndex,QueryArray, QueryLen);Results from Query!

Hints Read the .h files for documentation about functions! Understand the high level idea and data structures before getting started Follow the suggested implementation steps given in the CSE 333 HW2 spec

Pointers, References, & Const

Examplex, x ref55Consider the following code:int x 5;int& x ref x;int* x ptr &x;Note syntactic similarity to pointerdeclarationx ptr0x7fff.Still the address-of operator!What are some tradeoffs to using pointers vs references?

Pointers vs. ReferencesPointersReferences Can move to different data viareassignment/pointer arithmetic References the same data for itsentire lifetime - can’t reassign Can be initialized to NULL No sensible “default reference,”must be an alias Useful for output parameters:MyClass* output Useful for input parameters:const MyClass &input

Pointers, References, Parameters void func(int& arg) vs. void func(int* arg) Use references when you don’t want to deal with pointer semantics Allows real pass-by-reference Can make intentions clearer in some cases STYLE TIP: use references for input parameters and pointers for outputparameters, with the output parameters declared last Note: A reference can’t be NULL

Const Mark a variable with const to make acompile time check that a variable isnever reassigned Does not change the underlying writepermissions for this variable0x7fff.ro x ptr42xint x 42;// Read onlyconst int* ro x ptr &x;// Can still modify x withrw x ptr!int* rw x ptr &x;LegendRed can’t change box it’s next toBlack read and write// Only ever points to xint* const x ptr &x;0x7fff.rw x ptr0x7fff.x ptr

Exercise 1

x, x refro x refExercise 1int x 5;ro ptr10x7fff.50x7fff.x ptrint& x ref x;int* x ptr &x;0x7fff.const int& ro x ref x;ro ptr2const int* ro ptr1 &x;int* const ro ptr2 &x;“Const pointer to an int”“Pointer to a const int”Tip: Read the declaration “right-to-left”LegendRed can’t change box it’snext toBlack read and write

Exercise 1LegendRed can’t change box it’s next toBlack “read and write”ro ptr1void foo(const int& arg);void bar(int& arg);int x 5;int& x refint* x ptrconst int&const int*int* constx, x refro x ref50x7fff.0x7fff.ro ptr2x ptr0x7fff.Which lines result in a compiler error? x; &x;ro x ref x;ro ptr1 &x;ro ptr2 &x; OK ERROR bar(x ref);bar(ro x ref); ro x ref is constfoo(x ref);ro ptr1 (int*) 0xDEADBEEF;x ptr &ro x ref;ro x ref is constro ptr2 ro ptr2 2; ro ptr2 is const*ro ptr1 *ro ptr1 1; (*ro ptr1) isconst

Objects and const Methods

#ifndef POINT H#define POINT HCannot mutate theobject it’s called on.class Point {public:Trying to change x orPoint(const int x, const int y);y inside will throw aint get x() const { return x ; }compiler error!int get y() const { return y ; }double Distance(const Point& p) const;void SetLocation(const int& x, const int& y);private:int x ;int y ;}; // class Point#endif// POINT HA const class object can only callmember functions that have beendeclared as const

Exercise 2

Exercise 3aWhich lines of the snippets of code belowwould cause compiler errors? OK int z 5;const int* x &z;int* y &z;x y;*x *y; int z 5;int* const w &z;const int* const v &z;*v *w;*w *v; ERROR

Exercise 3a cont.Which lines of thesnippets of code belowwould cause compilererrors? OK ERRORclass MultChoice {public:MultChoice(int q, char resp) : q (q), resp (resp) { } // 2-arg ctorint get q() const { return q ; }char get resp() { return resp ; }bool Compare(MultChoice &mc) const; // do these MultChoice's match?private:int q ;// question numberchar resp ; // response: 'A','B','C','D', or 'E'}; // class MultChoiceconst MultChoice m1(1,'A');MultChoice m2(2,'B');cout m1.get resp();cout m2.get q(); const MultChoice m1(1,'A');MultChoice m2(2,'B');m1.Compare(m2);m2.Compare(m1);

What would you change about theclass declaration to make it better?class MultChoice {public:MultChoice(int q, char resp) : q (q), resp (resp) { } // 2-arg ctorint get q() const { return q ; }char get resp() { return resp ; }bool Compare(MultChoice &mc) const; // do these MultChoice's match?private:int q ;// question numberchar resp ; // response: 'A','B','C','D', or 'E'}; // class MultChoice

Deals with classes in C Due Monday (4/25) @ 10:00am. Homework 2 Overview. Homework 2 Main Idea: Build a search engine for a file system It can take in queriesand output a list of filesin a directory that has that query The query will be orderedbased on the number of times the query is in that file