COMS 601: Research Topics -- Computational Linguistics
Fall 2004
Syllabus

Course description: This graduate-level research topics course explores the field of Computational Linguistics. We will build a framework for addressing the following questions:
        What is important about language?
        What information do we use to understand each other?
        How can we quantify what we know about language so that computers can take advantage of it?
Tools will include finite state machines, language model toolkits, parsers and taggers.  Students will complete a research project with written and oral reports.

Pre-requisites: Graduate standing or permission of the instructor is necessary to register for this class.

Professor: Rebecca Bates
Computer and Information Sciences
Wissink Hall 243
Phone: 507-389-5587
Fax: 507-389-6376
Email: bates@mnsu.edu  

Course Website
http://bates.cs.mnsu.edu/coms601
Check the website regularly for announcements and updates. 

Course Hours and Location
Lecture: TR 2-3:30pm WH 284

Office Hours
Monday: **
Tuesday: 12-1pm
Wednesday: 1-5pm
Thursday: 12-1pm
Friday: 10am-12noon
Items that are useful for the entire class will be posted in the announcement section of the class webpage, so please check it regularly.
**Monday is my research day. Monday office hours, 4-6pm, are available by appointment only.

Course Materials
Highly-Recommended Text: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition, Daniel Jurafsky and James H. Martin, Prentice Hall, 2000.

Assignments will come from the textbook, so it will be worth buying. Additional reading material will be provided or linked to on the website.

Course Goals
This course presents an overview of the field of computational linguistics. Students will have the opportunity to apply programming and algorithm skills to problems in speech and language. Course projects will use software tools from the field, and all students will gain experience with available tools through course assignments. Research projects will allow for in-depth exploration of a problem, a tool or set of tools, or a corpus.

Class Schedule
This schedule is subject to change. It generally follows the course text.  Specific assignments and objectives will be added to the course web page.  

  1. Ch. 1: Introduction
  2. Ch. 2: Words: Regular Expressions and Automata
  3. Ch. 3: Morphology and Finite-State Transducers
  4. Overview of AT&T FSM software and speech recognition (see Ch. 5: Probabilistic Models of Pronunciation and Spelling, and Ch. 6: N-grams)
  5. Ch. 8: Syntax: Word Classes and Part-of-speech Tagging
  6. Ch. 9: Context-Free Grammars for English
  7. Ch. 10: Parsing with Context-Free Grammars
  8. Ch. 14: Semantics: Representing Meaning
  9. Ch. 16: Lexical Semantics
  10. Ch. 18: Pragmatics: Discourse
  11. Ch. 19: Dialogue and Conversational Agents
  12. Ch. 21: Machine Translation

Chapters not explored in class are also potential sources of project topics.

Course Tools

Grading
Homework, programming assignments, in-class work: 30%
Mid-term Exam: 25%
Final Project: 45% (including written and oral presentations)
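
As a rough illustration of how the weights above combine, here is a minimal sketch in Python. The component names, the function name, and the example scores are hypothetical and chosen only for illustration; the syllabus does not give numeric letter-grade cutoffs, so none are assumed here.

```python
# Weights taken from the grading breakdown above.
WEIGHTS = {
    "homework": 0.30,       # homework, programming assignments, in-class work
    "midterm": 0.25,        # mid-term exam
    "final_project": 0.45,  # final project (written and oral presentations)
}


def weighted_course_score(scores):
    """Combine per-component percentages (0-100) into one course percentage."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"homework": 90.0, "midterm": 80.0, "final_project": 85.0}
print(round(weighted_course_score(example), 2))  # prints 85.25
```

Because the final project carries the largest weight, a change of ten points there moves the course percentage by 4.5 points, compared with 3 points for homework and 2.5 points for the mid-term.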

Homework and Exams
The homework for this course will include problems from the book as well as programming projects. Homework is due at the beginning of class. Electronic submissions will be accepted for programming work but not for written work.

Your exam will be based on information gained through homework and programming projects as well as material covered in lectures and the book. 

Grading Policy

The course grade will be assigned based on the above grading distribution. In general, given the point totals, letter grades for the course will correspond to the following rubric. Since this is a graduate-level class, my expectation is that the quality of work will meet A and B levels.

A-level work
EXCELLENT
  1. Responds fully to what the assignment asks;
  2. demonstrates good critical thinking;
  3. invokes and uses disciplinary terminology correctly;
  4. provides thorough and detailed arguments;
  5. is well-organized and unified;
  6. uses appropriate direct language;
  7. is free of errors in syntax, grammar, punctuation, word choice, spelling, and format;
  8. correctly cites and documents sources (when applicable).
B-level work
VERY GOOD
Realizes high quality in most, but not all of the elements above.
C-level work
ADEQUATE
Realizes adequacy in most of the elements. "C" work might be well-written, but that does not compensate for poor or careless approaches to problem solving.  A C requires competency in programming.
D-level work
WEAK
Fails to realize some elements adequately and contains several relatively serious errors, flaws, or omissions, or a number of minor ones. 
F-level work
POOR
Fails to realize several elements adequately and contains many serious errors, flaws, or omissions, as well as many minor ones.

Expectations of Students

Disabilities
If you need accommodations for a disability, please make an appointment to see me during my office hours to discuss your needs.

Academic Honesty
By staying enrolled in this class, you agree to abide by the University's Policy for Academic Honesty, which appears in the Student Handbook under the section heading "Academic Honesty". If you have questions about the policy, please contact me, your advisor, or another faculty member PRIOR to engaging in a "dishonest" act. Failure to abide by and respect the Academic Honesty Policy will result in severe penalties as allowed by the University. I want to point out to you the following expectation, which comes directly from the University's Statement of Student Responsibilities:

In order for an academic community to teach and support appropriate educational values, an environment of trust, cooperation and personal responsibility must be maintained. As members of this University community, students assume the responsibility to fulfill their academic obligations in a fair and honest manner. This responsibility includes avoiding such inappropriate activities as plagiarism, cheating or collusion. Students found responsible for one or more of these activities may face both academic sanctions (such as lowering a grade, failing of a course, etc.) and disciplinary sanctions (such as probation, suspension, expulsion).

It is the intent of Minnesota State University, Mankato to encourage a sense of integrity on the part of students in fulfilling their academic requirements. To give students a better understanding of behaviors that may constitute academic dishonesty, the following definitions are provided:

Plagiarism – Submission of an academic assignment as one's own work, which includes critical ideas or written narrative that are taken from another author without the proper citation. This does not apply only to direct quotes, but also to critical ideas that are paraphrased by the student.

Plagiarism includes but is not limited to:

Cheating – Use of unauthorized material or assistance to help fulfill academic assignments. This material could include unauthorized copies of test materials, calculators, crib sheets, help from another student, etc.

Collusion – Assistance to another student or among students in committing the act of cheating or plagiarism.