brianray/gist:56f1dc33413274ddb4e4a148d6706b81

## gistfile1.md

      
    Raw
  

              gistfile1.md
            
          
    Scenarios

Jailbreaking for Data Leaked from Hacker Actor: the hacker gets data out of the model by pure prompt manipulation,
Intrusion from Hacker Actor: The malicious instructions pass through the application and can instruct the extension services to do bad things.
Data poisoning for Training data: The source data used to train the LLM has malicious content before it is trained.
Prompt Poising: hidden content or injected content unintentional from an innocent bystander.


      graph RL

    A1(Active Hacker Actor) -.1 melicious prompt.-> C1[LLM]
    subgraph sub1["1. Jailbreaking for Data Leaked from Hacker Actor"]
        C1[LLM]
    end
    C1[LLM] -.2 unauthrized data.-> A1(Hacker Actor)
    
    A2(Active Hacker Actor) -.1 melicious.->  C2[LLM]
    subgraph sub2["2. Intrusion from Hacker Actor"]
        C2[LLM] -.2 melicious instructions.-> D2[Extension Services] 
    end
    
    A3(Passive Hacker Actor) -.1 insert bad data.-> E3[Data Store]
    D3(User Actor) -.3 good prompt.-> C3[LLM] 
    E3[Data Store] -.2 training.-> C3[LLM]
    C3[LLM] -.4 bad response.-> D3(User Actor)
    subgraph sub3["3. Data poisoning for Training data"]
        E3[Data Store]
        C3[LLM]
    end

    
    D4(User Actor) -.1 cut .-> E4[data or source repository]
    D4(User Actor) -.3 prompt .-> C4[LLM]
    C4[LLM] -.4 bad content .-> D4(User Actor)
    E4[data or source repository] -.2 paste .-> D4(User Actor) 
    subgraph sub4["4. Prompt Poising"]
        C4[LLM]
        
    end
    

      Loading