Generate term-docId pairs with Python code

₹600-1500 INR

Closed
Posted: 6 months ago

Paid on delivery
I need a freelancer to generate term-docId pairs using Python code.

Input Data:
- The input data is in sgm and txt files.

Code:
- I do not have Python code to generate the term-docId pairs, but I have existing code for a similar task that can be used as a reference.

Deadline:
- The preferred deadline for the project is 2 days.

Skills and Experience:
- Proficiency in Python programming
- Experience working with different file formats (sgm and txt)
- Familiarity with generating term-docId pairs
- Ability to meet tight deadlines

Regarding the project:

Subproject I: Using the Reuters corpus, write the module that, while there are still more documents to be processed, accepts a document as a list of tokens (omitting punctuation) and outputs term-docID pairs. Instead of appending new term-docID pairs to a list, make sure you append the docID directly to the postings list for the term. You may use a hash table. No boxes required.
(a) Compare the timing of this SPIMI-inspired procedure with that of the naive indexer (for 10000 term-docID pairs).
(b) Compile an inverted index for Reuters21578 without using any compression techniques.
docID hint: use the NEWID values from the Reuters corpus to make your retrieval comparable.

Subproject II: Convert your indexer into a probabilistic search engine:
1. using the assumptions made about independence of terms and documents, etc., and
2. using the BM25 formula,
3. rank the documents your index returns, and
4. for a given query, return a ranked list of results.
Notes: experiment with different values for the parameters k1 and b.

Test queries:
1. Design four test queries:
(a) a single-keyword query. Compare the results of Subproject I with those of your naive indexer for the same queries
(b) a multiple-keyword query for Subproject I returning documents that contain all the keywords (AND), for unranked retrieval
(c) a multiple-keyword query returning documents that contain at least one keyword (OR), where documents are ordered by how many keywords they contain
(d) a query consisting of several keywords, for ranking with BM25
2. Run your four test queries to showcase your code, and comment on the results in your report.

Deliverables:
- Well-documented sample runs for your queries on the information needs:
(a) Democrats’ welfare and healthcare reform policies
(b) Drug company bankruptcies
(c) George Bush
- Any additional testing or aborted design ideas that show off particular aspects of your project.
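For orientation, the two subprojects described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the client's reference code: all function names (`tokenize`, `build_index`, `build_index_naive`, `query_and`, `query_or`, `bm25`) are hypothetical, the three toy documents stand in for Reuters articles keyed by NEWID, postings are stored as a docID-to-term-frequency dict so that BM25 can reuse them, and the idf term uses the common log(N/df) form with k1=1.2 and b=0.75 as starting values to experiment with.

```python
import math
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and keep alphanumeric runs, dropping punctuation."""
    return TOKEN_RE.findall(text.lower())


def build_index(docs):
    """SPIMI-inspired build: add each docID straight to its term's postings.

    `docs` maps docID -> list of tokens. Returns (postings, doc_len), where
    postings[term] maps docID -> term frequency (an assumption made here so
    the same structure can feed BM25).
    """
    postings = defaultdict(dict)  # hash table: term -> {docID: tf}
    doc_len = {}
    for doc_id, tokens in docs.items():
        doc_len[doc_id] = len(tokens)
        for term in tokens:
            plist = postings[term]
            plist[doc_id] = plist.get(doc_id, 0) + 1
    return postings, doc_len


def build_index_naive(docs):
    """Naive build: collect every term-docID pair, sort, dedupe, then group."""
    pairs = sorted({(t, d) for d, tokens in docs.items() for t in tokens})
    index = defaultdict(list)
    for term, doc_id in pairs:
        index[term].append(doc_id)
    return index


def query_and(postings, terms):
    """Unranked AND: docIDs that contain every query term."""
    sets = [set(postings.get(t, {})) for t in terms]
    return sorted(set.intersection(*sets)) if sets else []


def query_or(postings, terms):
    """OR: docIDs with at least one term, most matched keywords first."""
    hits = defaultdict(int)
    for t in set(terms):
        for doc_id in postings.get(t, {}):
            hits[doc_id] += 1
    return sorted(hits, key=lambda d: (-hits[d], d))


def bm25(postings, doc_len, terms, k1=1.2, b=0.75):
    """Rank documents by BM25, using idf = log(N / df)."""
    if not doc_len:
        return []
    n_docs = len(doc_len)
    avg_len = sum(doc_len.values()) / n_docs
    scores = defaultdict(float)
    for t in terms:
        plist = postings.get(t, {})
        if not plist:
            continue
        idf = math.log(n_docs / len(plist))
        for doc_id, tf in plist.items():
            norm = k1 * ((1 - b) + b * doc_len[doc_id] / avg_len) + tf
            scores[doc_id] += idf * (k1 + 1) * tf / norm
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy stand-ins for Reuters documents; the keys play the role of NEWID.
docs = {
    1: tokenize("Bush meets Democrats on welfare reform."),
    2: tokenize("Drug company files for bankruptcy."),
    3: tokenize("Bush, Bush and welfare."),
}
postings, doc_len = build_index(docs)
print(query_and(postings, ["bush", "welfare"]))
print(query_or(postings, ["bush", "drug", "welfare"]))
print(bm25(postings, doc_len, ["bush", "welfare"]))
```

For the timing comparison in part (a), both builders could be wrapped with `time.perf_counter()` on the same 10000 term-docID pairs. Parsing the sgm files into the `docs` mapping is left out of this sketch.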
Project ID: 37395068

About the project

3 bids
Remote project
Active: 5 months ago

3 freelancers bid an average of ₹2,183 INR for this job
I am an experienced IT professional and a Data Science practitioner. Your job caught my eye and looks quite interesting, as I did similar work in the recent past. I have developed various simple to complex algorithms pertaining to ML/DL/NLP/Computer Vision, from exploratory data analysis (EDA) through model building to deployment. I have good hands-on experience in data engineering and product development, including a conversational AI chatbot for industrial use cases. I am well conversant with Generative AI and have hands-on experience in developing AI applications using LangChain. I am confident that I will be able to help you by developing Python code for term-docId pairs that meets your requirements. Notable Projects successfully completed: - Conversational AI Chatbot (Rasa/WhatsApp) - PII redaction - Recommendation engine - Unsupervised preventive maintenance model - Ground water quality prediction - Topic modeling - Text classification - Named entity recognition - Image search - OCR image recognition and text extraction Relevant Skills: - Python - AWS - LangChain - Amazon Comprehend - TensorFlow - Google Colab - OpenCV - Tableau Let's have a chat to understand the project objective and dataset in detail. I am sure I can be the best fit for your project. I assure you of the best quality results and will ensure customer satisfaction. Looking forward to hearing from you soon. Thanks for the opportunity.
₹3,500 INR in 5 days
5.0 (15 reviews)
6.0

Hi, I can deliver the work within hours, as I have a strong background in Python programming that can be used to generate term-docId pairs. I understand the need for generating term-docId pairs with Python code, as it helps to reduce the workload of the user. Additionally, my experience in working with different file formats (sgm and txt) makes me well-versed in handling the project requirements. I am also confident that I can meet your deadline, as I have a track record of delivering quality work within the stipulated timeframe. If you choose me for this project, you can rest assured that you will not be disappointed with my work. Please feel free to contact me if you would like to discuss further or have any questions regarding my candidacy for this job.
₹2,000 INR in 4 days
4.9 (8 reviews)
3.5

I can complete this in 7 days. I have 29k followers on Instagram and recently started a YouTube channel. I edit videos, reels, and photos, have completed 30 online courses in computing, am a BSc computer student, hold certificates in editing and Python, and know other programming languages.
₹1,050 INR in 7 days
0.0 (0 reviews)
0.0

About the client

DEVARAPALLI, India
0.0
0
Member since: Feb 2, 2020
