A New Chapter in Blogging: Exploring the World of Agents
After a decade-long hiatus, I am thrilled to announce my return to blogging! This new journey will center on the fascinating and ever-evolving domain of Agents, with a particular focus on Software Engineering Agents (SWE-Agents). Through this blog, I aim to share insights, ideas, and developments in this exciting field. My goal is to spark thought-provoking discussions and provide content that readers find useful. Your feedback and perspectives matter, so I warmly invite you to share your thoughts in the comments and join the conversation.
-
Do SWE-Agents Solve Multi-File Issues Like Humans? A Deep Dive into SWE-Bench Verified
The year 2024 was remarkable for SWE-Agents, as we celebrated significant advancements in system performance on our cherished SWE-Bench benchmark. This progress was especially notable on the SWE-Bench Verified benchmark since its release. -
OpenHands CodeAct v2.1 v/s Tools + Claude 3.5 Sonnet
The recent release of the Claude-3.5 Sonnet (20241022) model has been a game-changer, revitalizing the SWE-Bench leaderboard. This model has been the driving force behind several systems currently holding the top positions. -
SWE-Bench Verified ⊊ real-world SWE tasks
The title “SWE-Bench Verified ⊊ real-world SWE tasks” conveys two points: SWE-Bench Verified ≠ real-world SWE tasks, and SWE-Bench Verified ⊆ real-world SWE tasks (i.e., it is a subset).
Posts from my blog’s previous iteration are available below:
-
Installing Octave on OS X 10.9 Mavericks
If you upgraded to OS X 10.9 Mavericks and have started to like the latest enhancements in Finder, new iBooks and Maps applications etc., installing Octave is going to make you forget all of it. The installation process is a herculean task and you better pray to God that you... -
Comparison is always false due to limited range of data type
This is a short, crisp post about a warning message that seemed quite strange at first. On closer inspection, though, the reason behind it was clear and straightforward. -
Keyboard Review - Microsoft Natural Ergonomic Keyboard 4000
As a software developer, I spend about 6-8 hours typing on a typical day. My search for a better keyboard began when I started having wrist pain and my fingers felt enervated after a few hours. On doing some research, I found that these are the symptoms of a Repetitive... -
Concurrent and Sequential statements in Verilog
This is a brief post for beginners to the Verilog language who come from a C/Java background. Verilog differs from a conventional programming language in that the execution of statements is not strictly sequential. Different code blocks are executed concurrently, as opposed to the sequential execution of most programming languages.... -
C++ - Variable Declaration in 'if' expression
Recently, I encountered a strange compiler error at work. I was trying something similar to this: if( (int var1 = func1() ) && ( int var2 = func2() ) ) { // Use var1 and var2 here - Doesn't compile } but it didn’t compile. So, I played my Jedi... -
Forward Class Declaration in C++
“In computer programming, a forward declaration is a declaration of an identifier (denoting an entity such as a type, a variable, or a function) for which the programmer has not yet given a complete definition.” -
Why use GIT and hang CVS?
I recently delivered a presentation on "Why use GIT and leave CVS". It covers various aspects where CVS is a total failure, thus stressing the need to change. -
Integer Limits and Types In C/C++
Unlike Java or C#, primitive data types in C++ can vary in size depending on the platform. For example, int is not guaranteed to be a 32-bit integer. The size of basic C++ types depends on