Skip to content

Instantly share code, notes, and snippets.

@lumpenspace
Last active December 25, 2023 17:01
Show Gist options
  • Save lumpenspace/1d058374a36906b2e2f4c2cd5963d8a3 to your computer and use it in GitHub Desktop.
Save lumpenspace/1d058374a36906b2e2f4c2cd5963d8a3 to your computer and use it in GitHub Desktop.
graphstuph

Dijkstra's Eigenvector monster thing

Closeness measure for highly clustered, highly variable networks

Lumpenspace and ChatGPT, 2023

Abstract

Look you can use your own OpenAI credits

Components

  • Intensity and Type of Interaction (Direct Connection):

    • Each interaction type is given a time-weighted value.
    • In the case of twitter, you can use the same they use for timeline boost.
    • $( I(u, v) )$ is the weighted interacrtion sum over time between $( u )$ and $( v )$.
  • Dijkstra's Eigenvector monster

    • Define a threshold value $( T )$ representing the strength relative to a direct connection we'd still care about.
    • Look for progressively farther connections until $t ( T \cdot I(u, v) )$ is satisfied,
  • Let $( u )$ and $( v )$ be two users in the network.

  • Let $( I(u, v) )$ represent the intensity and type of direct interaction between $( u )$ and $( v )$.

Proposed Iterative Formula:

Let $( C_k(u, v) )$ be the conrtribution from kth-degree connections. Initially, $( C_1(u, v) )$ 1st deg, $( C_2(u, v) )$ 2nd yada yada.

Thus for any $( k )$-th degree we have:

$$[ C_k(u, v) = \gamma^k \cdot \sum_{y \in N_k(u)} \frac{I(u, y) \cdot I(y, v)}{C(y) \cdot \alpha^k} ]$$

Where:

  • $( \gamma )$ is a degree decay factor.
  • $( \alpha )$ decay factor for the contribution of each connection based on the degree.
  • $( N_k(u) )$ set of k-th degree connections of user ( u ).

Continuation Condition:

$$[ \text{Continue if } \sum_{j=1}^{k} C_j(u, v) < T \cdot I(u, v) ]$$

where $( T )$ is a threshold value representing the strength relative to a direct connection we'd still care about.

Final Closeness Measure:

$$[ C(u, v) = I(u, v) + \sum_{j=1}^{k} C_j(u, v) ]$$

where $(k)$ is the smallest value for which the continuation condition is not satisfied.

And now, a word from our friends at OpenAI:

(query was: evaluate and share feasibility for the measure of closeness between two users in a highly clustered and highly variable feed/follow network like Twitter.)

The concept you've outlined, integrating iterative inclusion of higher-degree connections with a threshold for diminishing returns, is theoretically sound and aligns well with network theory and social network analysis principles. It's a sophisticated approach to measuring the closeness between individuals in a highly clustered and variable network like Twitter. Here's a breakdown of its feasibility and some considerations:

Feasibility and Considerations:

  1. Intensity and Type of Interaction (I(u, v)):

    • Feasibility: Quite high. Social networks often track various interactions that can be quantified and weighted.
    • Consideration: The challenge lies in accurately weighting different types of interactions and ensuring that the data reflects actual relationship strength.
  2. Dijkstra's Eigenvector Monster:

    • Feasibility: Moderate. The approach is computationally intensive but manageable with efficient algorithms and sufficient resources.
    • Consideration: The primary concern would be computational complexity and runtime, especially in a large and dynamic network like Twitter. Ensuring that the algorithm is efficient and scalable is crucial.
  3. Proposed Iterative Formula (C_k(u, v)):

    • Feasibility: Moderate to high. The iterative approach is a standard method in network analysis, and while it increases computational load, it's generally feasible with good algorithm design.
    • Consideration: Balancing the depth of iterations (degree of connections) with computational efficiency will be vital. There might be a need to cap the degree to which the network is explored or to sample connections in extremely large networks.
  4. Continuation Condition:

    • Feasibility: High. It's a straightforward conditional check within the iterative process.
    • Consideration: The choice of threshold ( T ) is crucial and might need to be empirically determined or adjustable based on the network's characteristics or the desired sensitivity.
  5. Final Closeness Measure (C(u, v)):

    • Feasibility: High. Once the iterative process and thresholds are defined, accumulating the measure is straightforward.
    • Consideration: Ensuring that the final measure is normalized or scaled appropriately for interpretability and comparison across different user pairs.

Practical Implementation:

  • Optimization: The algorithm must be optimized for performance, given the potential size of real-world social networks.
  • Dynamic Nature: Social networks are constantly changing, and the algorithm must be able to update measurements efficiently as new data comes in.
  • Validation: The proposed metrics and weights should be validated against known relationships or benchmarked using established network measures to ensure they reflect the intended closeness or relationship strength.

Conclusion:

Your approach to calculating closeness in highly clustered and variable networks like Twitter is well-founded in network theory and could provide a nuanced understanding of user relationships. However, practical implementation would require careful consideration of computational efficiency, data handling, and algorithm optimization. With those considerations in mind, and potentially after some empirical tuning and validation, this methodology could indeed work to provide valuable insights into the structure and dynamics of social networks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment