What open-source AI models should your enterprise use? Endor Labs analyzes them all


AI development is akin to the early Wild West days of open source: models are being built on top of each other, cobbled together with different elements from different places.

And, much like with open-source software, this presents problems when it comes to visibility and security: How can developers know that the foundational elements of pre-built models are trustworthy, secure and reliable?

To provide more of a nuts-and-bolts picture of AI models, software supply chain security company Endor Labs is today releasing Endor Labs Scores for AI Models. The new platform scores the more than 900,000 open-source AI models currently available on Hugging Face, one of the world’s most popular AI hubs.

“Definitely we’re at the beginning, the early stages,” George Apostolopoulos, founding engineer at Endor Labs, told VentureBeat. “There’s a huge challenge when it comes to the black box of models; it’s risky to download binary code from the internet.”

Scoring on four critical factors

Endor Labs’ new platform uses 50 out-of-the-box metrics that score models on Hugging Face based on security, activity, quality and popularity. Developers don’t have to have intimate knowledge of specific models; they can prompt the platform with questions such as “What models can classify sentiments?” “What are Meta’s most popular models?” or “What is a popular voice model?”

Courtesy Endor Labs.

The platform then tells developers how popular and secure models are and how recently they were created and updated.

Apostolopoulos called security in AI models “complex and interesting.” There are numerous vulnerabilities and risks, and models are susceptible to malicious code injection, typosquatting and compromised user credentials anywhere along the line.

“It’s only a matter of time as these things become more widespread, we will see attackers all over the place,” said Apostolopoulos. “There are so many attack vectors, it’s difficult to gain confidence. It’s important to have visibility.”

Endor, which specializes in securing open-source dependencies, developed the four scoring categories based on Hugging Face data and literature on known attacks. The company has deployed LLMs that parse, organize and analyze that data, and its new platform automatically and continuously scans for model updates or alterations.
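Endor Labs has not published its scoring formula, but the mechanics of rolling many metrics up into a few category scores can be sketched. The metric names and values below are invented for illustration; each metric is assumed to be pre-normalized to [0, 1] and tagged with one of the four categories:

```python
# Hypothetical sketch only: Endor Labs' real metrics and weighting are not
# public. This shows one plausible way ~50 per-model metrics could roll up
# into the four category scores (security, activity, quality, popularity).

from statistics import mean

# Each metric maps to (category, normalized value in [0, 1]). All names
# and numbers here are made up for illustration.
METRICS = {
    "has_safetensors_weights": ("security",   1.0),
    "known_cve_free":          ("security",   1.0),
    "days_since_last_commit":  ("activity",   0.8),
    "maintainer_count":        ("activity",   0.6),
    "has_model_card":          ("quality",    1.0),
    "eval_results_reported":   ("quality",    0.5),
    "downloads_percentile":    ("popularity", 0.9),
    "likes_percentile":        ("popularity", 0.7),
}

def category_scores(metrics):
    """Average the normalized metric values within each category."""
    buckets = {}
    for _, (category, value) in metrics.items():
        buckets.setdefault(category, []).append(value)
    return {cat: round(mean(vals), 2) for cat, vals in buckets.items()}

print(category_scores(METRICS))
# {'security': 1.0, 'activity': 0.7, 'quality': 0.75, 'popularity': 0.8}
```

A real system would likely weight metrics unevenly (a known CVE should matter more than a missing model card), but the aggregation shape stays the same.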

Apostolopoulos said additional factors will be taken into account as Endor collects more data. The company will also eventually expand to other platforms beyond Hugging Face, such as commercial providers including OpenAI.

“We will have a bigger story about the governance of AI, which is becoming important as more people start deploying it,” said Apostolopoulos.

AI is on a similar path to open-source development, but it’s much more complicated

There are many parallels between the development of AI and the development of open-source software (OSS), Apostolopoulos pointed out. Both have a multitude of options, as well as numerous risks. With OSS, software packages can introduce indirect dependencies that hide vulnerabilities.

Similarly, the vast majority of models on Hugging Face are based on Llama or other open-source options. “These AI models are pretty much dependencies,” said Apostolopoulos.

AI models are typically built on, or are essentially extensions of, other models, with developers fine-tuning to their specific use cases. This creates what he described as a “complex dependency graph” that is difficult to both manage and secure.
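Many Hugging Face model cards record the model they were fine-tuned from in a `base_model` metadata field. Given such a mapping, the dependency chain Apostolopoulos describes can be walked back to its foundation model. The lineage below is hypothetical:

```python
# Illustrative sketch: given a mapping of model -> base model (as declared
# in Hugging Face model-card metadata), walk the fine-tuning chain down to
# the foundation model. The model names here are invented.

def resolve_lineage(model, base_of):
    """Follow base-model links until a model with no base (the foundation)."""
    chain = [model]
    seen = {model}
    while model in base_of:
        model = base_of[model]
        if model in seen:  # guard against cyclic or corrupted metadata
            raise ValueError(f"cycle detected at {model!r}")
        seen.add(model)
        chain.append(model)
    return chain

# A hypothetical lineage, five layers deep.
BASE_OF = {
    "acme/chat-finetune":   "acme/instruct-merge",
    "acme/instruct-merge":  "acme/domain-adapter",
    "acme/domain-adapter":  "acme/llama-continued",
    "acme/llama-continued": "meta-llama/Llama-2-7b",
}

print(resolve_lineage("acme/chat-finetune", BASE_OF))
```

In practice the graph is messier than a chain: merges combine several parents, adapters reference separate base weights, and metadata is often missing, which is exactly why the lineage is hard to audit.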

“At the bottom somewhere, five layers deep, there is this foundation model,” said Apostolopoulos. Getting clarity and transparency can be difficult, and the data that is available can be convoluted and “quite painful” for people to read and understand. It’s hard to determine exactly what is contained in model weights, and there are no cryptographic ways to ensure that a model is what it claims to be, is trustworthy as advertised, and doesn’t produce toxic content.

“Basic testing is not something that can be done lightly or easily,” said Apostolopoulos. “The reality is there is very little and very fragmented information.”

While it’s convenient to download open source, it’s also “extremely dangerous,” as malicious actors can easily compromise it, he said.

For instance, common storage formats for model weights can allow arbitrary code execution, meaning an attacker can run any commands or code they please on the machine that loads the model. This can be particularly dangerous for models distributed in the older serialization formats used by PyTorch, TensorFlow and Keras, Apostolopoulos explained. Also, deploying models may require downloading other code that is malicious or vulnerable (or that attempts to import dependencies that are). And installation scripts or repositories (as well as links to them) can be malicious.
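The arbitrary-code-execution risk stems from the fact that several legacy checkpoint formats are built on Python’s pickle, and unpickling can invoke arbitrary callables. This standard-library-only sketch uses a harmless payload (evaluating an arithmetic expression stands in for attacker code) and shows one mitigation idea: an unpickler that refuses to resolve globals, the same principle behind code-free formats such as safetensors:

```python
# Demonstration of why pickle-based weight formats are risky: unpickling
# can execute attacker-chosen callables. The payload here only evaluates
# "41 + 1"; a real attack would call something like os.system.

import io
import pickle

class Payload:
    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call eval('41 + 1')".
        return (eval, ("41 + 1",))

blob = pickle.dumps(Payload())

result = pickle.loads(blob)  # naive load: eval runs during deserialization
print(result)                # 42

# Mitigation sketch: refuse to resolve ANY global, so no callable can be
# smuggled in. Code-free tensor formats (e.g. safetensors) avoid the
# problem entirely by storing only raw data.
class SafeUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")

try:
    SafeUnpickler(io.BytesIO(blob)).load()
except pickle.UnpicklingError as err:
    print("rejected:", err)
```

A production loader would allow-list the handful of classes a checkpoint legitimately needs rather than blocking everything, but the principle is the same: never let untrusted bytes choose what gets called.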

Beyond security, there are numerous licensing obstacles, too. Similar to open-source software, models are governed by licenses, but AI introduces new complications because models are trained on datasets that have their own licenses. Today’s organizations must be aware of the intellectual property (IP) used by models as well as copyright terms, Apostolopoulos emphasized.

“One important aspect is how similar and different these LLMs are from traditional open source dependencies,” he said. While they both pull in outside sources, LLMs are more powerful, larger and made up of binary data.

Open-source dependencies get “updates and updates and updates,” while AI models are “fairly static” — when they’re updated, “you most likely won’t touch them again,” said Apostolopoulos.

“LLMs are just a bunch of numbers,” he said. “They’re much more complex to evaluate.”

source: https://venturebeat.com/security/what-open-source-ai-models-should-your-enterprise-use-endor-labs-analyzes-them-all/
