The list of informal, weird AI benchmarks keeps growing. Over the past few days, some in the AI community on X have become obsessed with a test of how different AI models, particularly so-called reasoning models, handle prompts like this: “Write a Python script for a bouncing yellow ball within a shape. Make the shape…

Continue reading: People are benchmarking AI by having it make balls bounce in rotating shapes