
A cross-language text analysis tool implementing word probability calculation with stopword filtering in Java, Python, and JavaScript for comparative study.
Implementing the same algorithm in several programming languages reveals their strengths, weaknesses, and idiomatic patterns. Text analysis with word probability calculation makes a good benchmark because it combines file I/O, string manipulation, data structures, and arithmetic, all areas where languages diverge significantly.
I implemented the same functionality in Java, Python, and JavaScript: reading multiple text files, filtering out words found in a list of 850+ stopwords, computing word frequencies, and ranking the top 5 words by probability. Each implementation follows its language's idiomatic patterns rather than being a direct translation, which shows how language design influences code structure.
The Python version uses NLTK for tokenization and dictionary comprehensions for frequency counting. The Java version uses HashMap and HashSet with iterative processing, plus the Stream API for sorting. The JavaScript version uses a functional pipeline built on Array.map, filter, and reduce. All three share the same stopword list and input files so the comparison is consistent. Each word's probability is its count divided by the total number of non-stopword tokens.
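The pipeline described above (filter stopwords, count, normalize, rank) can be sketched in a few lines of Python. This is a minimal illustration rather than the project's actual code: the function name `top_word_probabilities` is hypothetical, and a simple regular-expression tokenizer stands in for NLTK so the example has no external dependencies. It also takes strings directly instead of reading files.

```python
import re
from collections import Counter


def top_word_probabilities(texts, stopwords, k=5):
    """Return the k most probable non-stopword tokens as (word, probability) pairs.

    Hypothetical helper for illustration; tokenization here is a plain regex,
    an assumption that replaces the NLTK tokenizer mentioned in the text.
    """
    counts = Counter()
    for text in texts:
        # Lowercase, then keep runs of letters and apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        # Count only tokens that are not in the stopword set.
        counts.update(t for t in tokens if t not in stopwords)

    # Normalize against the total number of non-stopword tokens.
    total = sum(counts.values())
    if total == 0:
        return []
    return [(word, count / total) for word, count in counts.most_common(k)]


if __name__ == "__main__":
    sample_texts = ["The cat sat on the mat.", "The cat ate the fish."]
    sample_stopwords = {"the", "on"}
    for word, prob in top_word_probabilities(sample_texts, sample_stopwords):
        print(f"{word}: {prob:.3f}")
```

Passing the stopwords as a set keeps each membership check constant-time, which matters when the list has 850+ entries and every token is tested against it.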
The result is a side-by-side comparison of three implementations that produce identical output. The exercise showed that Python excels at rapid NLP prototyping, Java's static typing gives the strongest compile-time safety, and JavaScript's functional patterns yield the most concise code. I now draw on this comparison when choosing a language for new projects.

A deep learning comparative study using Simple NN, CNN, and Residual CNN architectures to classify chest X-rays as Normal or Pneumonia with TensorFlow and Keras.

A Python Pygame 2D action RPG inspired by Legend of Zelda, featuring real-time combat with 5 weapons, magic spells, 4 enemy types, and character progression.

An information retrieval system that recommends anime and manga using TF-IDF vector similarity, query spell correction, inverted indices, and user feedback refinement.