A research paper that I co-authored was published today in the IEEE Access journal. In this work, we present a method for clustering software repositories and apply it to 1659 GitHub repositories. Long story short, it seems to be possible, by just looking at 28 software metrics, to tell which category/cluster a repository belongs to, for example: in active development, sudden peak of activity, the disillusionment stage, and so on. We intend to use the results of this research to solve the problem of automating software engineering (robots that help programmers).
A new lecture from the course on the quality of software projects, which I am finishing teaching at HSE. I tried to explain how NFRs differ from functional requirements and to suggest methods for testing them.
Just published the third lecture (80 minutes) of the "Practical Program Analysis" course, about contextual analysis: how to turn a concrete syntax tree into an abstract syntax tree and what the purpose of such a transition is.
I haven't recommended any good movies to you for a while. Here is a very good one: A Serious Man (2009) by the Coen brothers. It's that kind of movie: the more you watch it, the better it becomes.
The 4th lecture (80 minutes) from the "Practical Program Analysis" course is published. It's about formal semantics: operational and denotational. I tried to explain how it's possible to formalize a programming language.
One more lecture from the course on the quality of software projects, for HSE students, this time about "second-order" testing: tests that test tests.
The fifth lecture (81 minutes) of the PPA course (Innopolis University) was just published. It is about abstract machines and their applicability to the analysis of programs.
Make a guess: how many lines of C++ code constitute HotSpot (a Java virtual machine that has been developed by Oracle for over 20 years)?
Anonymous Poll
85,000: 16%
850,000: 20%
8,500,000: 38%
85,000,000: 26%
If you are a tech startup founder based in Russia, I can connect you with people at a Fortune 100 tech giant who make seed investments (up to $5M) in Russia. Text me.
PS. Zero-revenue startups are very welcome.
After asking ChatGPT multiple times "what are the features available in object-oriented programming languages", I got the following list. What are we still missing?
"Polymorphism, Nested Objects, Traits, Templates, Generics, Invariants, Classes, NULL, Pointers, Goto, Error Handling, Exception Handling, Operators, Loops, Methods, Static Blocks, Virtual Tables, Coroutines, Threads, Monads, Algebraic Types, References, Variables, Function Inlining, Type Checking, Annotations, Arrays, Mutability, Interfaces, Constructors, Destructors, Lifetimes, Volatile Variables, Synchronization, Macros, Inheritance, Overloading, Tuple Types, Closures, Higher-Order Functions, Access Modifiers, Pattern Matching, Enumerated Types, Namespaces, Modules, Type Aliases, Decorators, Lambda Functions, Type Inference, Properties, Value Types, Multiple Inheritance, Structural Typing, Events, Callbacks, Dynamic Typing, NULL Safety, Streams, Buffers, Iterators, Generators, Metaprogramming, Aspects, Anonymous Objects, Anonymous Functions, Reflection, Type Casting, Lazy Evaluation, Garbage Collection, Immutability, Recursion, Concurrency Control, Context Management, Memory Management, Logging, Breakpoints, Assertions, Caching."
In two weeks, on the 22nd of April, at 17:00 (Moscow time), we will live-stream the ICCQ'23 conference. It will start with a keynote speech by Prof. David West (the author of the "Object Thinking" book), followed by the presentations of four accepted research papers. The conference is organised in cooperation with the IEEE Computer Society and hosted by St. Petersburg State University.
The sixth lecture (80 minutes) of the PPA course was just published. It is about the ingredients of program analysis, such as its metrics, basic principles, definitions, etc. The next three lectures will be about data flow analysis, symbolic execution, and model checking.
The seventh lecture (79 minutes) in the "Practical Program Analysis" series was just published. It is about data flow analysis, with a few practical examples. I tried to explain the subject in terms as simple as possible.
I published the final, 16th lecture (80 minutes) of the course "Quality Management in Software Projects" for fourth-year students of the Higher School of Economics (HSE, Moscow): I tried to explain which characteristics a repository with source code and tickets must have in order to be considered well organized and under control.
I created a small Rust library called micromap, which implements a traditional hash map, but without the hash and without using the heap. Thanks to this, it appears to be five times (!) faster than the standard HashMap. The work is not finished, though, since not all functions of a map are implemented. Besides, I'm not a Rust expert at all, so the library most certainly has mistakes. You are very welcome to contribute. The list of unsolved issues is here.
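To illustrate the general idea of a map "without the hash and without the heap", here is a minimal sketch. It is not micromap's actual code or API: the name `TinyMap`, its fields, and its methods are hypothetical, and I assume that keys are found by a plain linear scan over a fixed-size array stored inside the struct. Such a design can only pay off for small maps, since every lookup walks the stored pairs one by one.

```rust
// Hypothetical illustration only; this is NOT micromap's real implementation.
// All pairs live in a fixed-size array inside the struct, so no heap is used.
struct TinyMap<K: PartialEq + Copy, V: Copy, const N: usize> {
    pairs: [Option<(K, V)>; N],
    len: usize,
}

impl<K: PartialEq + Copy, V: Copy, const N: usize> TinyMap<K, V, N> {
    fn new() -> Self {
        Self { pairs: [None; N], len: 0 }
    }

    /// Inserts a pair or overwrites the value of an existing key.
    /// Returns false if the key is new and the map is already full.
    fn insert(&mut self, k: K, v: V) -> bool {
        // No hashing: compare the key with every stored key, one by one.
        for slot in self.pairs[..self.len].iter_mut() {
            if let Some((key, value)) = slot {
                if *key == k {
                    *value = v;
                    return true;
                }
            }
        }
        if self.len == N {
            return false;
        }
        self.pairs[self.len] = Some((k, v));
        self.len += 1;
        true
    }

    /// Finds the value by a linear scan over the occupied slots.
    fn get(&self, k: K) -> Option<V> {
        self.pairs[..self.len]
            .iter()
            .flatten()
            .find(|(key, _)| *key == k)
            .map(|(_, v)| *v)
    }
}

fn main() {
    // The capacity (4 here) is part of the type and is fixed at compile time.
    let mut m: TinyMap<&str, i32, 4> = TinyMap::new();
    m.insert("one", 1);
    m.insert("two", 2);
    m.insert("one", 11); // overwrites the earlier value
    println!("{:?} {:?} {:?}", m.get("one"), m.get("two"), m.get("three"));
}
```

The capacity is a const generic parameter, which is what lets the whole map sit on the stack. The price is that an insert into a full map has to be rejected (or handled in some other way).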
Maybe some of you have heard about Puzzle Driven Development (PDD), a methodology I suggested about twelve years ago: it helps decompose larger programming tasks into smaller pieces. Since then, PDD has been actively used in many GitHub repositories, through the 0pdd bot (if you still don't use it, try it now!). Last year I participated in a research project that attempted to extend PDD with Machine Learning, letting AI make puzzle assignment decisions. Just a few days ago, a research paper that I co-authored was published: "Prioritizing Tasks in Software Development: A Systematic Literature Review". As its name hints, it surveys existing decision-making methods, highlighting their strengths and weaknesses. BTW, a bit earlier we published a short paper about the research itself: "Automatically Prioritizing and Assigning Tasks from Code Repositories in Puzzle Driven Development". More papers coming...
The eighth lecture (77 minutes) in the "Practical Program Analysis" series was just published. It is about symbolic execution, and also about test case generation and concolic testing. It is a high-level overview, which may help you understand the topic before diving deeper.
The ninth lecture (76 minutes) is published. It is about Model Checking. Watch it and get ready for the last one, where we will discuss the application of Machine Learning (a.k.a. Artificial Intelligence) to program analysis and the future of analysis.