@digantamisra98
Created August 12, 2019 12:58
Mish Class Definition in Keras
# Keras Implementation of Mish Activation Function.

# Import Necessary Modules.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from keras.engine.base_layer import Layer
from keras import backend as K

class Mish(Layer):
    '''
    Mish Activation Function.
    .. math::
        mish(x) = x * tanh(softplus(x)) = x * tanh(ln(1 + e^{x}))
    Shape:
        - Input: Arbitrary. Use the keyword argument `input_shape`
        (tuple of integers, does not include the samples axis)
        when using this layer as the first layer in a model.
        - Output: Same shape as the input.
    Examples:
        >>> X_input = Input(input_shape)
        >>> X = Mish()(X_input)
    '''

    def __init__(self, **kwargs):
        super(Mish, self).__init__(**kwargs)
        self.supports_masking = True

    def call(self, inputs):
        return inputs * K.tanh(K.softplus(inputs))

    def get_config(self):
        # NOTE: `config` is undefined here; see the fix discussed in the comments below.
        base_config = super(Mish, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

    def compute_output_shape(self, input_shape):
        return input_shape
@dsadulla commented Mar 5, 2020

@digantamisra98 I was trying to use this in tf.keras, but it keeps saying that it cannot find `config` in the `get_config` method.
Since this is an activation function with no arguments except the input, is the config an empty dictionary? I was able to resolve it by changing the `get_config` method in the following way. Do you see any issues with this?

    def get_config(self):
        config = super(Mish, self).get_config()
        return config

The complete implementation looks as follows:


# Keras Implementation of Mish Activation Function.

# Import Necessary Modules.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from tensorflow.keras.layers import Layer
from tensorflow.keras import backend as K

class Mish(Layer):
    '''
    Mish Activation Function.
    .. math::
        mish(x) = x * tanh(softplus(x)) = x * tanh(ln(1 + e^{x}))
    Shape:
        - Input: Arbitrary. Use the keyword argument `input_shape`
        (tuple of integers, does not include the samples axis)
        when using this layer as the first layer in a model.
        - Output: Same shape as the input.
    Examples:
        >>> X_input = Input(input_shape)
        >>> X = Mish()(X_input)
    '''

    def __init__(self, **kwargs):
        super(Mish, self).__init__(**kwargs)
        self.supports_masking = True

    def call(self, inputs):
        return inputs * K.tanh(K.softplus(inputs))

    def get_config(self):
        config = super(Mish, self).get_config()
        return config

    def compute_output_shape(self, input_shape):
        return input_shape
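
One quick way to sanity-check that version is a save/load round trip, since `get_config` is exactly what model saving exercises. A minimal sketch, assuming TF 2.x; the layer sizes and file name are arbitrary:

    # Round-trip test: saving and reloading only succeeds if get_config behaves.
    from tensorflow.keras.models import Sequential, load_model
    from tensorflow.keras.layers import Dense

    model = Sequential([Dense(10, input_shape=(4,)), Mish()])
    model.save('mish_test.h5')
    reloaded = load_model('mish_test.h5', custom_objects={'Mish': Mish})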

@digantamisra98 (Author)
@dsadulla yes, that would be fine. The config dictionary was originally written to accept a beta parameter that scales the negative concavity of Mish, which changes the formula to: x * tanh(softplus(x + beta))
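
For reference, here is a minimal sketch of what that parameterized variant could look like; the `ParametricMish` name and the `beta` handling are assumptions based on the formula above, not the original gist code:

    # Sketch of a beta-parameterized Mish layer (hypothetical; derived from
    # the formula x * tanh(softplus(x + beta)) mentioned above).
    import tensorflow as tf
    from tensorflow.keras.layers import Layer

    class ParametricMish(Layer):
        def __init__(self, beta=0.0, **kwargs):
            super(ParametricMish, self).__init__(**kwargs)
            self.beta = beta
            self.supports_masking = True

        def call(self, inputs):
            # x * tanh(softplus(x + beta))
            return inputs * tf.tanh(tf.math.softplus(inputs + self.beta))

        def get_config(self):
            # Here the extra config dict actually carries a parameter.
            config = {'beta': self.beta}
            base_config = super(ParametricMish, self).get_config()
            return dict(list(base_config.items()) + list(config.items()))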

@EscVM commented Jun 1, 2020

This works fine @dsadulla

@rupshali commented Oct 5, 2020

What keyword do I use if I want to use this as the activation function when training my CNN model?
The line of code that I want to write is: model.add(Dense(50, activation='mish'))
But this shows me an error saying: __init__() takes 1 positional argument but 2 were given
Please help.

@digantamisra98 (Author)
@rupshali You can define mish as a function rather than as a Layer, and pass that function as the activation to any Keras layer that supports activations.
For example:
Defining Mish as a function -

## Mish Activation Function
import tensorflow as tf

def mish(x):
    # x * tanh(ln(1 + e^x)); tf.math.softplus computes ln(1 + e^x)
    return tf.keras.layers.Lambda(lambda x: x * tf.tanh(tf.math.softplus(x)))(x)

Defining a network with Mish activations:

## LeNet Architecture
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Activation

model = Sequential()
model.add(Conv2D(20, 5, padding="same", input_shape=inputShape, activation=mish))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

model.add(Conv2D(50, 5, padding="same", activation=mish))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

model.add(Flatten())
model.add(Dense(500, activation=mish))

model.add(Dense(numClasses))
model.add(Activation("softmax"))
model.summary()
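
Alternatively, if you specifically want the string form activation='mish' to work, the function can first be registered with Keras's custom objects. This registration step is not from the original gist, just a sketch assuming TF 2.x:

    # Register mish under a string name so activation='mish' resolves.
    import tensorflow as tf
    from tensorflow.keras.utils import get_custom_objects

    def mish(x):
        # softplus(x) = ln(1 + e^x)
        return x * tf.tanh(tf.math.softplus(x))

    get_custom_objects().update({'mish': mish})
    # After registration, this line works:
    # model.add(Dense(50, activation='mish'))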

Hope this helps.

@rupshali commented Oct 6, 2020

Thanks a lot @digantamisra98
