Fossil fuels produce NO2, which is linked to asthma attacks, bronchitis, and higher risks of heart disease and stroke, according to the EV news site Electrek. But the nonprofit news site Grist.org notes a new analysis showing that those emissions decreased by 1.1% for every increase of 200 electric vehicles - across nearly 1,700 ZIP codes. "A pretty small addition of cars at the ZIP code level led to a decline in air pollution," said Sandrah Eckel, a public health professor at the University of Southern California's Keck School of Medicine and lead author of the study. "It's remarkable."
The study was done at the University of Southern California's medical school by researchers using high-resolution satellite data, reports Electrek:
The study, just published in The Lancet Planetary Health and partly funded by the National Institutes of Health, adds rare real-world evidence to a claim that's often taken for granted - that EVs don't just cut carbon over time, they also improve local air quality right now... The researchers ran multiple checks to make sure the trend wasn't driven by unrelated factors. They accounted for pandemic-era changes by excluding 2020 in some analyses and controlling for gas prices and work-from-home patterns. They also saw the expected counterexample: neighborhoods that added more gas-powered vehicles experienced increases in pollution. The findings were then replicated using updated ground-level air monitoring data dating back to 2012...
Next, the researchers plan to compare EV adoption with asthma-related emergency room visits and hospitalizations. If those trends line up, it could provide some of the clearest evidence yet of what we already know: that electrifying transportation doesn't just clean the air on paper; it improves public health in practice.
Thanks to long-time Slashdot reader jhoegl for sharing the article.
Long-time Slashdot reader fahrbot-bot writes: Researchers have developed a robotic hand that can not only skitter about on its fingertips but also bend its fingers backward, connect and disconnect from a robotic arm, and pick up and carry one or more objects at a time.
This article in Science News includes footage of the robotic arm reattaching itself to the skittering robot hand, which can also hold objects against both sides of its palm simultaneously, and "can even unscrew the cap off a mustard bottle while holding the bottle in place."
With its unusual agility, it could navigate and retrieve objects in spaces too confined for human hands. When attached to the mechanical arm, the robotic hand could pick up objects much like a human hand. The bot pinched a ball between two fingers, wrapped four fingers around a metal rod and held a flat disc between fingers and palm.
But the bot isn't constrained by human anatomy... When the robot was separated from the arm, it was most stable walking on four or five fingers and using one or two fingers for grabbing and carrying things, the team found. In one set of trials with both bots, the hand detached from the robotic arm and used its fingers as legs to skitter over to a wooden block. Once there, it picked up the block with one finger and carried it back to the arm.
The crawling bot could one day aid in industrial inspections of pipes and equipment too small for a human or larger robot to access, says Xiao Gao, a roboticist now at Wuhan University in China. It might retrieve objects in a warehouse or navigate confined spaces in disaster response efforts.
Imagine a 280-unit apartment complex offering no on-site leasing office with a human agent for questions. "Instead, the entire process has been outsourced to AI..." reports SFGate, "from touring to signing the lease to completing management tasks once you actually move in."
Now imagine it's far more than just one apartment complex...
At two other Jack London Square apartment buildings, my initial interactions were also with a robot. At the Allegro, my fiance and I entered the leasing office for our tour and asked for "Grace P," the leasing agent who had emailed us. "Oh, that's just our AI assistant," the woman at the front desk told us... At Aqua Via, another towering apartment complex across the street, I emailed back and forth with a very helpful and polite "Sofia M." My pal Sofia seemed so human-like in her responses that I did not realize she was AI until I looked a little closer at a text she'd sent me. "Msgs may be AI or human generated...." [S]he continued to text me for weeks after I'd moved on, trying to win me back. When I looked at the fine print, I realized both of these complexes were using EliseAI, a leading AI housing startup that claims to be involved in managing 1 in 6 apartments in the U.S...
[50 corporate landlords have funded a VC named RET Ventures to invest in and deploy rental-automating AI, and SFGate's reporter spoke to partner Christopher Yip.] According to Yip, AI is common in large apartment complexes not just in the tech-centric Bay Area, but across the entire country. It all kicked off at the onset of the COVID-19 pandemic in 2020, he said, when contactless, self-guided apartment tours and completely virtual tours where people rented apartments sight unseen became commonplace. Technology's infiltration into the renting process has only grown deeper in the years since, Yip said, mirroring how pervasive AI has become in many other facets of our lives. "From an industry perspective, it's really about meeting the renter where they are," Yip said. He pointed to how many renters now prefer to interact through text and email, and want to tour apartments at their convenience - say, at 7 p.m. after work, when a typical leasing office might be closed.
The latest updates in technology not only allow you to take a self-guided tour with AI unlocking the door for you, but also to ask AI questions by conversing with voice AI as you wander through the kitchen and bedroom at your leisure. And while a human leasing agent might ghost you for days or weeks at a time, AI responds almost instantly - EliseAI typically responds within 30 seconds, [said Fran Loftus, chief experience officer at EliseAI]... [I]n some scenarios, the goal does seem to be to eliminate humans entirely. "We do have long-term plans of building fully autonomous buildings," Loftus said.... "We think there's a time and a place for that, depending on the type of property. But really right now, it's about helping with this crazy turnover in this industry." The reporter says they missed the human touch, since "The second AI was involved, the interaction felt cold. When a human couldn't even be bothered to show up to give me a tour, my trust evaporated."
But they conclude that in the years ahead, human landlords offering tours "will probably go the way of landlines and VCRs."
Programmer/entrepreneur Paul Ford is the co-founder of AI-driven business software platform Aboard. This week he wrote a guest essay for the New York Times titled "The AI Disruption Has Arrived, and It Sure Is Fun," arguing that Anthropic's Claude Code "was always a helpful coding assistant, but in November it suddenly got much better, and ever since I've been knocking off side projects that had sat in folders for a decade or longer... [W]hen the stars align and my prompts work out, I can do hundreds of thousands of dollars worth of work for fun (fun for me) over weekends and evenings, for the price of the Claude $200-a-month."
He elaborates on his point on the Aboard.com blog:
I'm deeply convinced that it's possible to accelerate software development with AI coding - not deprofessionalize it entirely, or simplify it so that everything is prompts, but make it into a more accessible craft. Things which not long ago cost hundreds of thousands of dollars to pull off might come for hundreds of dollars, and be doable by you, or your cousin. This is a remarkable accelerant, dumped into the public square at a bad moment, with no guidance or manual - and the reaction of many people who could gain the most power from these tools is rejection and anxiety. But as I wrote....
I believe there are millions, maybe billions, of software products that don't exist but should: Dashboards, reports, apps, project trackers and countless others. People want these things to do their jobs, or to help others, but they can't find the budget. They make do with spreadsheets and to-do lists.
I don't expect to change any minds; that's not how minds work. I just wanted to make sure that I used the platform offered by the Times to say, in as cheerful a way as possible: Hey, this new power is real, and it should be in as many hands as possible. I believe everyone should have good software, and that it's more possible now than it was a few years ago.
From his guest essay:
Is the software I'm making for myself on my phone as good as handcrafted, bespoke code? No. But it's immediate and cheap. And the quantities, measured in lines of text, are large. It might fail a company's quality test, but it would meet every deadline. That is what makes A.I. coding such a shock to the system... What if software suddenly wanted to ship? What if all of that immense bureaucracy, the endless processes, the mind-boggling range of costs that you need to make the computer compute, just goes?
That doesn't mean that the software will be good. But most software today is not good. It simply means that products could go to market very quickly. And for lots of users, that's going to be fine. People don't judge A.I. code the same way they judge slop articles or glazed videos. They're not looking for the human connection of art. They're looking to achieve a goal. Code just has to work... In about six months you could do a lot of things that took me 20 years to learn. I'm writing all kinds of code I never could before - but you can, too. If we can't stop the freight train, we can at least hop on for a ride.
The simple truth is that I am less valuable than I used to be. It stings to be made obsolete, but it's fun to code on the train, too. And if this technology keeps improving, then all of the people who tell me how hard it is to make a report, place an order, upgrade an app or update a record - they could get the software they deserve, too. That might be a good trade, long term.
The cost of sequencing a human genome has fallen from $100M to under $100 in approximately 25 years (Reddit)
Element Biosciences reportedly hit the $100 genome milestone (Feb 2026).
For context: the Human Genome Project (2000) cost ~$100M, the ~$1,000 genome was achieved around 2014, and the price is now under $100 - roughly 25 years in all.
That’s a 1,000,000x cost reduction, far outpacing Moore’s Law. If this trend continues, personalized genomics becomes mass-market scale. Article + thread below.
Thread and Progress Chart
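The "far outpacing Moore's Law" claim is easy to sanity-check from the figures in the post (a rough sketch; the 2-year cost-halving period is the conventional Moore's Law assumption, not from the article):

```python
# Rough sanity check of the cost-reduction claim, using figures from the post.
start_cost = 100_000_000   # ~$100M, Human Genome Project era (~2000)
end_cost = 100             # ~$100 per genome (reported 2026)
years = 25

reduction = start_cost / end_cost   # actual cost reduction factor
# Moore's Law baseline: cost halves every ~2 years over the same period
moore_reduction = 2 ** (years / 2)

print(reduction)                # 1000000.0
print(round(moore_reduction))   # 5793
```

So the 1,000,000x drop beats the Moore's Law baseline by a factor of well over 100.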
After 16 Years, 'Interim' CTO Finally Eradicating Fujitsu and Horizon From the UK's Post Office (Slashdot)
Besides running tech operations at the UK's Post Office, their interim CTO is also removing and replacing Fujitsu's Horizon system, which Computer Weekly describes as "the error-ridden software that a public inquiry linked to 13 people taking their own lives."
After over 16 years of covering the scandal they'd first discovered back in 2009, Computer Weekly now talks to CTO Paul Anastassi about his plans to finally remove every trace of the Horizon system that's been in use at Post Office branches for over 30 years - before the year 2030:
"There are more than 80 components that make up the Horizon platform, and only half of those are managed by Fujitsu," said Anastassi. "The other components are internal and often with other third parties as well," he added... The plan is to introduce a modern front end that is device agnostic. "We want to get away from [the need] to have a certain device on a certain terminal in your branch. We want to provide flexibility around that...."
Anastassi is not the first person to be given the task of terminating Horizon and ending Fujitsu's contract. In 2015, the Post Office began a project to replace Fujitsu and Horizon with IBM and its technology, but after things got complex, Post Office directors went crawling back to Fujitsu. Then, after Horizon was proved in the High Court to be at fault for the account shortfalls that subpostmasters were blamed and punished for, the Post Office knew it had to change the system. This culminated in the New Branch IT (NBIT) project, but this ran into trouble and was eventually axed. This was before Anastassi's time, and before that of its new top team of executives.... Things are finally moving at pace, and by the summer of this year, two separate contracts will be signed with suppliers, signalling the beginning of the final act for Fujitsu and its Horizon system. Anastassi has 30 years of IT management experience, the article points out, and he estimates the project will even bring "a considerable cost saving over what we currently pay for Fujitsu."
America's Peace Corps Announces 'Tech Corps' Volunteers to Help Bring AI to Foreign Countries (Slashdot)
Over 240,000 Americans have volunteered for Peace Corps projects in 142 countries since the program began more than half a century ago.
But now the agency is launching a new initiative - called Tech Corps. "It's the Peace Corps, but make it AI," explains Engadget:
The Peace Corps' latest proposal will recruit STEM graduates or those with professional experience in the artificial intelligence sector and send them to participating host countries.
According to the press release, volunteers will be placed in Peace Corps countries that are part of the American AI Exports Program, which was created last year from an executive order from President Trump as a way to bolster the US' grip on the AI market abroad. Tech Corps members will be tasked with using AI to resolve issues related to agriculture, education, health and economic development. The program will offer its members 12- to 27-month in-person assignments or virtual placements, which will include housing, healthcare, a living stipend and a volunteer service award if the corps member is placed overseas.
"American technology to power prosperity," reads the headline at Tech Corps web site. ("Build the tech nations depend on... See the world. Be the future."
The site says they're recruiting "service-minded technologists to serve in the Peace Corps to help countries around the world harness American AI to enhance opportunity and prosperity for their citizens." (And experienced technology professionals can donate 5-15 hours a week "to mentor and support projects on-the-ground.")
Berlin-based T2 Linux developer René Rebe (long-time Slashdot reader ReneR) is announcing that their Xorg display server has now restored its XAA acceleration architecture, "bringing fixed-function hardware 2D acceleration back to many older graphics cards that upstream left in software-rendered mode."
Older fixed-function GPUs now regain smooth window movement, low CPU usage, and proper 24-bpp framebuffer support (also restored in T2). Tested hardware includes ATi Mach-64 and Rage-128, SiS, Trident, Cirrus, Matrox (Millennium/G450), Permedia2, Tseng ET6000 and even the Sun Creator/Elite 3D.
The result: vintage and retro systems and classic high-end Unix workstations that are fast and responsive again.
This week the Python Software Foundation explained how they keep Python secure. A new blog post recognizes the volunteers and paid Python Software Foundation staff on the Python Security Response Team (PSRT), who "triage and coordinate vulnerability reports and remediations keeping all Python users safe."
Just last year the PSRT published 16 vulnerability advisories for CPython and pip, the most in a single year to date! And the PSRT usually can't do this work alone: coordinators are encouraged to involve maintainers and experts on the affected projects and submodules. Involving the experts directly in the remediation process ensures fixes adhere to existing API conventions and threat models, are maintainable long-term, and have minimal impact on existing use-cases. Sometimes the PSRT even coordinates with other open source projects to avoid catching the Python ecosystem off-guard by publishing a vulnerability advisory that affects multiple other projects. The most recent example of this is PyPI's ZIP archive differential attack mitigation.
This work deserves recognition and celebration just like contributions to source code and documentation. [Security Developer-in-Residence Seth Larson and PSF Infrastructure Engineer Jacob Coffee] are developing further improvements to workflows involving "GitHub Security Advisories" to record the reporter, coordinator, and remediation developers and reviewers to CVE and OSV records to properly thank everyone involved in the otherwise private contribution to open source projects.
"Feb 20 (Reuters) - OpenAI is targeting roughly $600 billion in total compute spend through 2030, a source familiar with the matter told Reuters on Friday, as the ChatGPT maker lays groundwork for an IPO that could value it at up to $1 trillion.
OpenAI's 2025 revenue totaled $13 billion, beating its $10 billion projection, while it spent $8 billion during the year, under its $9 billion target, the person said."
Bugs surviving decades of expert review and millions of fuzzing hours just got found by an AI. Claude Code Security emerges.
Researchers at the Thomas Jefferson National Accelerator Facility are advancing Accelerator-Driven Systems (ADS) that use high-energy proton beams to transmute long-lived nuclear waste into shorter-lived isotopes. "The process also generates significant heat, which can be harnessed to produce additional electricity for the grid," reports Interesting Engineering. The projects are supported by $8.17 million in grants from the Department of Energy's NEWTON (Nuclear Energy Waste Transmutation Optimized Now) program. From the report: The researchers are developing ADS technology. This system uses a particle accelerator to fire high-energy protons at a target (such as liquid mercury), triggering a process called "spallation." This releases a flood of neutrons that interact with unwanted, long-lived isotopes in nuclear waste. The technology can effectively "burn" the most hazardous components of the waste by transmuting these elements. While unprocessed fuel remains dangerous for approximately 100,000 years, partitioning and recycling via ADS can reduce that window to just 300 years. [...]
To make ADS economically viable, Jefferson Lab is tackling two primary technical hurdles: efficiency and power. Traditional particle accelerators require massive, expensive cryogenic cooling systems to reach superconducting temperatures. Jefferson Lab is pioneering a more cost-effective approach by coating the interior of pure niobium cavities with tin. These niobium-tin cavities can operate at higher temperatures, allowing for the use of standard commercial cooling units rather than custom, large-scale cryogenic plants. The team is also developing spoke cavities, a complex design intended to drive even higher efficiency in neutron spallation.
The second project focuses on the power source behind the beam. Researchers are adapting the magnetron -- the same component that powers microwave ovens -- to provide the 10 megawatts of power required for ADS. The primary challenge is that the energy frequency must match the accelerator cavity precisely at 805 Megahertz. In collaboration with Stellant Systems, researchers are prototyping advanced magnetrons that can be combined to reach the necessary high-power thresholds with maximum efficiency. The NEWTON program aims to enable the recycling of the entire US commercial nuclear fuel stockpile within the next 30 years.
An anonymous reader quotes a report from NPR: NASA could launch four astronauts on a mission to fly around the moon as soon as March 6th. That's the launch date (PDF) that the space agency is now working towards following a successful test fueling of its big, 322-foot-tall moon rocket, which is standing on a launch pad at the Kennedy Space Center in Florida.
"This is really getting real," says Lori Glaze, acting associate administrator of NASA's exploration systems development mission directorate. "It's time to get serious and start getting excited." But she cautioned that there's still some pending work that remains to be done out at the launch pad, and officials will have to conduct a multi-day flight readiness review late next week to make sure that every aspect of the mission is truly ready to go. "We need to successfully navigate all of those, but assuming that happens, it puts us in a very good position to target March 6th," she says, noting that the flight readiness review will be "extensive and detailed." [...]
When NASA workers first tested out fueling the rocket earlier this month, they encountered problems like a liquid hydrogen leak. Swapping out some seals and other work seems to have fixed these issues, according to officials who say that the latest countdown dress rehearsal went smoothly, despite glitches such as a loss of ground communications in the Launch Control Center that forced workers to temporarily use backups.
I wanted to see how far I can go with just svg, and Gemini 3.1 Pro certainly did not disappoint.
Important disclaimer here: This was definitely not built with a single prompt. But I can assure you that every object in this scene was generated by Gemini 3.1 Pro.
Core isometric engine code for anyone else who wants to play around:
https://gist.github.com/andrew-kramer-inno/3f7697e92026ac98897ba609d4cfaea6
An anonymous reader quotes a report from Bloomberg: Shares of cybersecurity software companies tumbled Friday after Anthropic PBC introduced a new security feature into its Claude AI model. Crowdstrike Holdings was the among the biggest decliners, falling as much as 6.5%, while Cloudflare slumped more than 6%. Meanwhile, Zscaler dropped 3.5%, SailPoint shed 6.8%, and Okta declined 5.7%. The Global X Cybersecurity ETF fell as much as 3.8%, extending its losses on the year to 14%.
Anthropic said the new tool "scans codebases for security vulnerabilities and suggests targeted software patches for human review." The firm said the update is available in a limited research preview for now.
^(I made a previous post showing this comparison, but as I mentioned in that post, some builds that Gemini 3.1 Pro would make were simply not of the quality that was expected of the model.)
^(TLDR: Found out those builds were routed to 3.0 Pro, not 3.1 Pro. Have since deleted the previous post.)
With these new builds, I think Gemini 3.0 Pro -> 3.1 Pro feels more like a generational leap, same as 2.5 Pro -> 3.0 Pro felt (at least until it gets nerfed again)
Some notes:
- The actual JSONs created from the model's output were noticeably longer than 3.0 Pro's; some JSONs exceeded 11 million lines, and the average was 2 million (for context, GPT 5.2-Pro averages 200,000 lines).
- The Phoenix build is the largest at 11-million lines (161MB) -> paid for better bucket storage 😭
- The builds, being so large, actually take multiple seconds to load in the arena; will be finding a way to optimize that
- The model had a very high tendency to use typical Minecraft blocks (for example: Cyan Wool) which weren't actually given in the system prompt's block palette; i.e. the model seemed to hallucinate a fair amount
- The system prompt was also improved, something I've been working on for a few weeks now, which likely did play a role in the better builds, but as much as I'd like to take credit, I don't think my prompt did anything to actually improve the overall fidelity of the builds; it was more focused on guiding all LLMs to be more creative
- (Gemini 3.1 Pro has been completely reset on the leaderboard with all of its builds correctly uploaded to the database)
Benchmark: https://minebench.ai/
Git Repository: https://github.com/Ammaar-Alam/minebench
Previous post comparing Opus 4.5 and 4.6, also answered some questions about the benchmark
Previous post comparing Opus 4.6 and GPT-5.2 Pro
(Disclaimer: This is a benchmark I made, so technically self-promotion, but I thought it was a cool comparison :)
An anonymous reader quotes a report from Scientific American: Why does "bouba" sound round and "kiki" sound spiky? This intuition that ties certain sounds to shapes is oddly reliable all over the world, and for at least a century, scientists have considered it a clue to the origin of language, theorizing that maybe our ancestors built their first words upon these instinctive associations between sound and meaning. But now a new study adds an unexpected twist: baby chickens make these same sound-shape connections, suggesting that the link to human language may not be so unique. The results, published today in Science, challenge a long-standing theory about the so-called bouba-kiki effect: that it might explain how humans first tethered meaning to sound to create language. Perhaps, the thinking goes, people just naturally agree on certain associations between shapes and sounds because of some innate feature of our brain or our world. But if the barnyard hen also agrees with such associations, you might wonder if we've been pecking at the wrong linguistic seed.
Maria Loconsole, a comparative psychologist at the University of Padua in Italy, and her colleagues decided to investigate the bouba-kiki effect in baby chicks because the birds could be tested almost immediately after hatching, before their brain would be influenced by exposure to the world. The researchers placed chicks in front of two panels: one featured a flowerlike shape with gently rounded curves; the other had a spiky blotch reminiscent of a cartoon explosion. They then played recordings of humans saying either "bouba" or "kiki" and observed the birds' behavior. When the chicks heard "bouba," 80 percent of them approached the round shape first and spent an average of more than three minutes exploring it compared with an average of just under one minute spent exploring the spiky shape. The exploration preferences were flipped when the chicks heard "kiki."
Because the tests took place within the chicks' carefully supervised first hours of life outside their eggshell, this association between particular sounds and shapes couldn't have been learned from experience. Instead it may be evidence of an innate perceptual bias that goes back way farther in our evolutionary history than previously believed. "We parted with birds on the evolutionary line 300 million years ago," says Aleksandra Cwiek, a linguist at Nicolaus Copernicus University in Toruń, Poland, who was not involved in the study. "It's just mind-blowing."
Google has introduced Gemini 3.1 Pro, a reasoning-focused upgrade aimed at more complex problem-solving. 9to5Google reports: This .1 increment is a first for Google, with the past two generations seeing .5 as the mid-year model update. (2.5 Pro was first announced in March and saw further updates in May for I/O.) Google says Gemini 3.1 Pro "represents a step forward in core reasoning." The "upgraded core intelligence" that debuted last week with Gemini 3 Deep Think is now available in Gemini 3.1 Pro for more users. This model achieves an ARC-AGI-2 score of 77.1%, or "more than double the reasoning performance of 3 Pro."
This "advanced reasoning" translates to practical applications like when "you're looking for a clear, visual explanation of a complex topic, a way to synthesize data into a single view, or bringing a creative project to life." 3.1 Pro is designed for tasks where a simple answer isn't enough, taking advanced reasoning and making it useful for your hardest challenges.
Ever experienced 16K tokens per second? It's insanely instant. Try their Llama 3.1 8B demo here: chat jimmy.
They have a very radical approach to solving the compute problem - albeit a risky one in a landscape where model architectures evolve in weeks instead of years: etch the model and all the weights onto a single silicon chip.
Normally that would take ages, but they seem to have found a way to go from model to ASIC in 60 days - which might make their approach appealing for domains where raw intelligence is not so much of importance, but latency is super important, like real-time speech models, real-time avatar generation, computer vision etc.
Here are their claims:
- < 1 Millisecond Latency
- > 17k Tokens per Second per User
- 20x Cheaper to Produce
- 10x More Power Efficient
- 60 Days from Unseen Software to Custom Silicon: This part is crazy—it normally takes months...
- 0% Exotic Hardware Required, thus cheap: They ditch HBM, advanced packaging, 3D stacking, liquid cooling, high speed IO - because they put everything into one chip to achieve ultimate simplicity.
- LoRA Support: Despite the model being "baked" into silicon, you can adapt it, constrained to the architecture and parameter count. Their demonstrator uses Llama 3.1 8B but supports LoRA fine-tuning.
- Just 24 Engineers and $30M: That's what they spent on the first demonstrator.
- Bigger Reasoning Model Coming this Spring
- Frontier LLM Coming this Winter
Those are their claims, taken from their website: The path to ubiquitous AI | Taalas
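For scale, the claimed per-user throughput implies well under a millisecond between tokens, which at least makes the latency and throughput claims mutually consistent:

```python
# Quick arithmetic on the claimed per-user throughput (figures from the list above).
tokens_per_second = 17_000
us_per_token = 1e6 / tokens_per_second  # microseconds between successive tokens

print(round(us_per_token, 1))  # 58.8
```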
Minecraft: Java Edition is switching its rendering backend from OpenGL to Vulkan as part of the upcoming Vibrant Visuals update, aiming for both better performance and modern graphics features across platforms like Linux and macOS (via translation layers). GamingOnLinux reports: For modders, they're suggesting they start making preparations to move away from OpenGL: "Switching from OpenGL to Vulkan will have an impact on the mods that currently use OpenGL for rendering, and we anticipate that updating from OpenGL to Vulkan will take modders more effort than the updates you undertake for each of our releases. To start with, we recommend our modding community look at moving away from OpenGL usage. We encourage authors to try to reuse as much of the internal rendering APIs as possible, to make this transition as easy as possible. If that is not sufficient for your needs, then come and talk to us!"
It does mean that players on really old devices that don't support Vulkan will be left out, but Vulkan has been supported going back to some pretty old GPUs. You've got time though, as they'll be rolling out Vulkan alongside OpenGL in snapshots (development releases) "sometime over the summer." You'll be able to toggle between them during the testing period until Mojang believe it's ready. OpenGL will be entirely removed eventually once they're happy with performance and stability.
Frankly speaking, this model feels like it's out of this world and shouldn't exist. Beats Claude Sonnet 4.6 in every way possible.
Been testing it extensively. It is the only model to perfectly ace my personal code benchmark so far. Does everything incredibly well, writes extremely clean React, Python, and Golang code. Does impeccable reasoning.
The UI design and native SVG generation are next level.
This is the model I've been waiting for. Just hoping Google doesn't nerf this like it does to almost every pro model after 2 weeks.
Microsoft Research has published a paper in Nature detailing Project Silica, a working demonstration that uses femtosecond lasers to etch data into small slabs of glass at a density of over a Gigabit per cubic millimeter and a maximum capacity of 4.84 terabytes per slab. The slabs themselves are 12 cm by 12 cm and just 2 mm thick, and Microsoft's accelerated aging experiments suggest the data etched into them would remain stable for over 10,000 years at room temperature, requiring zero energy to preserve.
The system writes data by firing laser pulses lasting just 10^-15 seconds to create tiny features called voxels inside the glass, each capable of storing more than one bit, and reads it back using phase contrast microscopy paired with a convolutional neural network trained to interpret the images. Writing remains the main bottleneck -- four lasers operating simultaneously achieve 66 megabits per second, meaning a full slab would take over 150 hours to write, though the team believes adding more lasers is feasible.
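The "over 150 hours" figure follows directly from the reported capacity and write rate; a quick check:

```python
# Check the writing-time figure from the reported Project Silica numbers.
capacity_bits = 4.84e12 * 8   # 4.84 TB per slab, in bits
write_rate = 66e6             # 66 megabits per second (four lasers)

hours = capacity_bits / write_rate / 3600
print(round(hours))  # 163
```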
This is actually revolutionary. Google got a 19% increase in model performance by changing how parameters update. Wtf...19% is worth billions of dollars. This might be one of the biggest discoveries in AI recently.🚀
Summary from Gemini: Historically, training LLMs has relied on "dense" optimizers like Adam or RMSProp, which update every single parameter at every training step. This paper shows that randomly skipping (masking) 50% of parameter updates actually results in a better, more stable model. It improves model performance by up to 19% over standard methods, costs zero extra compute or memory, and requires just a few lines of code to implement.
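A hypothetical few-line sketch of the masking idea, shown on plain SGD with NumPy rather than the Adam/RMSProp setup the paper actually studies (names and details here are illustrative, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)

def masked_sgd_step(params, grads, lr=0.1, keep_prob=0.5):
    """One update step that randomly skips ~50% of parameter updates.

    Illustrative sketch of the masking idea described above, applied to
    plain SGD; the paper's actual optimizer details may differ.
    """
    mask = rng.random(params.shape) < keep_prob  # fresh random mask each step
    return params - lr * grads * mask            # masked-out entries keep old value

params = np.ones(8)
grads = np.full(8, 0.5)
new_params = masked_sgd_step(params, grads)
# each entry either stays at 1.0 (skipped) or moves to 0.95 (updated)
```

The key point is that the mask is resampled every step, so over many steps every parameter still gets updated, just not all at once.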
Exposing biases, moods, personalities, and abstract concepts hidden in large language models (MIT News)

By now, ChatGPT, Claude, and other large language models have accumulated so much human knowledge that they’re far from simple answer-generators; they can also express abstract concepts, such as certain tones, personalities, biases, and moods. However, it’s not obvious exactly how these models come to represent abstract concepts from the knowledge they contain.
Now a team from MIT and the University of California San Diego has developed a way to test whether a large language model (LLM) contains hidden biases, personalities, moods, or other abstract concepts. Their method can zero in on connections within a model that encode a concept of interest. What’s more, the method can then manipulate, or “steer,” these connections to strengthen or weaken the concept in any answer a model is prompted to give.
The team proved their method could quickly root out and steer more than 500 general concepts in some of the largest LLMs used today. For instance, the researchers could home in on a model’s representations for personalities such as “social influencer” and “conspiracy theorist,” and stances such as “fear of marriage” and “fan of Boston.” They could then tune these representations to enhance or minimize the concepts in any answers that a model generates.
In the case of the “conspiracy theorist” concept, the team successfully identified a representation of this concept within one of the largest vision language models available today. When they enhanced the representation, and then prompted the model to explain the origins of the famous “Blue Marble” image of Earth taken from Apollo 17, the model generated an answer with the tone and perspective of a conspiracy theorist.
The team acknowledges there are risks to extracting certain concepts, which they also illustrate (and caution against). Overall, however, they see the new approach as a way to illuminate hidden concepts and potential vulnerabilities in LLMs that could then be turned up or down to improve a model’s safety or enhance its performance.
“What this really says about LLMs is that they have these concepts in them, but they’re not all actively exposed,” says Adityanarayanan “Adit” Radhakrishnan, assistant professor of mathematics at MIT. “With our method, there’s ways to extract these different concepts and activate them in ways that prompting cannot give you answers to.”
The team published their findings today in a study appearing in the journal Science. The study’s co-authors include Radhakrishnan, Daniel Beaglehole and Mikhail Belkin of UC San Diego, and Enric Boix-Adserà of the University of Pennsylvania.
A fish in a black box
As use of OpenAI’s ChatGPT, Google’s Gemini, Anthropic’s Claude, and other artificial intelligence assistants has exploded, scientists are racing to understand how models represent certain abstract concepts such as “hallucination” and “deception.” In the context of an LLM, a hallucination is a response that is false or contains misleading information, which the model has “hallucinated,” or constructed erroneously as fact.
To find out whether a concept such as “hallucination” is encoded in an LLM, scientists have often taken an approach of “unsupervised learning” — a type of machine learning in which algorithms broadly trawl through unlabeled representations to find patterns that might relate to a concept such as “hallucination.” But to Radhakrishnan, such an approach can be too broad and computationally expensive.
“It’s like going fishing with a big net, trying to catch one species of fish. You’re gonna get a lot of fish that you have to look through to find the right one,” he says. “Instead, we’re going in with bait for the right species of fish.”
He and his colleagues had previously developed the beginnings of a more targeted approach with a type of predictive modeling algorithm known as a recursive feature machine (RFM). An RFM is designed to directly identify features or patterns within data by leveraging a mathematical mechanism that neural networks — a broad category of AI models that includes LLMs — implicitly use to learn features.
Since the algorithm was an effective, efficient approach for capturing features in general, the team wondered whether they could use it to root out representations of concepts in LLMs, which are by far the most widely used type of neural network and perhaps the least well understood.
“We wanted to apply our feature learning algorithms to LLMs to, in a targeted way, discover representations of concepts in these large and complex models,” Radhakrishnan says.
Converging on a concept
The team’s new approach identifies any concept of interest within an LLM and “steers” or guides a model’s response based on this concept. The researchers looked for 512 concepts within five classes: fears (such as of marriage, insects, and even buttons); experts (social influencer, medievalist); moods (boastful, detachedly amused); a preference for locations (Boston, Kuala Lumpur); and personas (Ada Lovelace, Neil deGrasse Tyson).
The researchers then searched for representations of each concept in several of today’s large language and vision models. They did so by training RFMs to recognize numerical patterns in an LLM that could represent a particular concept of interest.
A standard large language model is, broadly, a neural network that takes a natural language prompt, such as “Why is the sky blue?” and divides the prompt into individual words, each of which is encoded mathematically as a list, or vector, of numbers. The model takes these vectors through a series of computational layers, creating matrices of many numbers that, throughout each layer, are used to identify other words that are most likely to be used to respond to the original prompt. Eventually, the layers converge on a set of numbers that is decoded back into text, in the form of a natural language response.
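That pipeline can be caricatured in a few lines of NumPy; the vocabulary, shapes, and random weights below are all toy stand-ins, nothing like a real model:

```python
import numpy as np

rng = np.random.default_rng(42)

vocab = ["why", "is", "the", "sky", "blue", "scattering"]  # toy vocabulary
d_model = 4                                                # toy vector size

# Each token is encoded mathematically as a vector of numbers.
embeddings = rng.normal(size=(len(vocab), d_model))

def forward(token_ids, n_layers=2):
    """Toy stand-in for an LLM forward pass: embed, transform, decode."""
    x = embeddings[token_ids]              # tokens -> vectors
    for _ in range(n_layers):              # a series of computational layers
        w = rng.normal(size=(d_model, d_model))
        x = np.tanh(x @ w)                 # matrices of numbers at each layer
    logits = x[-1] @ embeddings.T          # score every vocabulary word
    return vocab[int(np.argmax(logits))]   # decode back into text

prompt_ids = [0, 1, 2, 3, 4]               # "why is the sky blue"
next_word = forward(prompt_ids)            # most likely next word (toy output)
```

A real LLM replaces the random matrices with billions of trained parameters and the tanh with attention and MLP layers, but the embed-transform-decode shape is the same.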
The team’s approach trains RFMs to recognize numerical patterns in an LLM that could be associated with a specific concept. As an example, to see whether an LLM contains any representation of a “conspiracy theorist,” the researchers would first train the algorithm to identify patterns among LLM representations of 100 prompts that are clearly related to conspiracies, and 100 other prompts that are not. In this way, the algorithm would learn patterns associated with the conspiracy theorist concept. Then, the researchers can mathematically modulate the activity of the conspiracy theorist concept by perturbing LLM representations with these identified patterns.
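The contrastive procedure just described can be sketched as follows; the paper trains recursive feature machines (RFMs) for this step, while this toy version substitutes a simple difference-of-means direction on synthetic hidden states to show the overall shape:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # hidden-state dimension (toy size)

# Stand-ins for LLM representations of 100 prompts clearly related to a
# concept (e.g. conspiracies) and 100 unrelated prompts, as in the text.
true_dir = rng.normal(size=d)
pos = rng.normal(size=(100, d)) + true_dir   # concept-related prompts
neg = rng.normal(size=(100, d))              # unrelated prompts

# Learn a direction separating the two sets. The paper trains an RFM here;
# a difference of means is a crude illustrative stand-in.
direction = pos.mean(axis=0) - neg.mean(axis=0)
direction /= np.linalg.norm(direction)

def steer(hidden, strength=3.0):
    """Perturb a hidden state along the learned concept direction."""
    return hidden + strength * direction

h = rng.normal(size=d)
score_before = h @ direction
score_after = steer(h) @ direction   # steered state scores higher on the concept
```

Steering then amounts to adding (or subtracting) the learned direction to a model's internal representations, strengthening or weakening the concept in the generated answer.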
The method can be applied to search for and manipulate any general concept in an LLM. Among many examples, the researchers identified representations and manipulated an LLM to give answers in the tone and perspective of a “conspiracy theorist.” They also identified and enhanced the concept of “anti-refusal,” showing that a model that would normally refuse certain prompts instead answered them, for instance giving instructions on how to rob a bank.
Radhakrishnan says the approach can be used to quickly search for and minimize vulnerabilities in LLMs. It can also be used to enhance certain traits, personalities, moods, or preferences, such as emphasizing the concept of “brevity” or “reasoning” in any response an LLM generates. The team has made the method’s underlying code publicly available.
“LLMs clearly have a lot of these abstract concepts stored within them, in some representation,” Radhakrishnan says. “There are ways where, if we understand these representations well enough, we can build highly specialized LLMs that are still safe to use but really effective at certain tasks.”
This work was supported, in part, by the National Science Foundation, the Simons Foundation, the TILOS institute, and the U.S. Office of Naval Research.

![[FIXED] Difference Between Gemini 3.0 Pro and Gemini 3.1 Pro on MineBench (Spatial Reasoning Benchmark)](https://i.redd.it/xxbfyrsvlpkg1.gif)
![[FIXED] Difference Between Gemini 3.0 Pro and Gemini 3.1 Pro on MineBench (Spatial Reasoning Benchmark)](https://i.redd.it/gm1mxrd3mpkg1.gif)
![[FIXED] Difference Between Gemini 3.0 Pro and Gemini 3.1 Pro on MineBench (Spatial Reasoning Benchmark)](https://i.redd.it/euspwqd3mpkg1.gif)
![[FIXED] Difference Between Gemini 3.0 Pro and Gemini 3.1 Pro on MineBench (Spatial Reasoning Benchmark)](https://i.redd.it/08stnvimnpkg1.gif)
![[FIXED] Difference Between Gemini 3.0 Pro and Gemini 3.1 Pro on MineBench (Spatial Reasoning Benchmark)](https://i.redd.it/o0h9p5aenpkg1.gif)
![[FIXED] Difference Between Gemini 3.0 Pro and Gemini 3.1 Pro on MineBench (Spatial Reasoning Benchmark)](https://i.redd.it/y4kgh2filpkg1.gif)
![[FIXED] Difference Between Gemini 3.0 Pro and Gemini 3.1 Pro on MineBench (Spatial Reasoning Benchmark)](https://i.redd.it/2r8s5ssvlpkg1.gif)
![[FIXED] Difference Between Gemini 3.0 Pro and Gemini 3.1 Pro on MineBench (Spatial Reasoning Benchmark)](https://i.redd.it/utk96rsvlpkg1.gif)
![[FIXED] Difference Between Gemini 3.0 Pro and Gemini 3.1 Pro on MineBench (Spatial Reasoning Benchmark)](https://i.redd.it/g6gkxfxgmpkg1.gif)
![[FIXED] Difference Between Gemini 3.0 Pro and Gemini 3.1 Pro on MineBench (Spatial Reasoning Benchmark)](https://i.redd.it/vtluzg1umpkg1.gif)
![[FIXED] Difference Between Gemini 3.0 Pro and Gemini 3.1 Pro on MineBench (Spatial Reasoning Benchmark)](https://i.redd.it/0fn16xg6npkg1.gif)
![[FIXED] Difference Between Gemini 3.0 Pro and Gemini 3.1 Pro on MineBench (Spatial Reasoning Benchmark)](https://i.redd.it/9lb99034opkg1.gif)